Research Background

My research background is in computer vision and machine learning. I wrote my Ph.D. thesis on using reinforcement learning to control image understanding processes back in 1993, under the guidance of Ed Riseman and Allen Hanson and with advice from Andy Barto. From 1993 to 1996 I worked on unmanned ground vehicles (what we would now call self-driving cars) under the DARPA UGV program. I then moved to Colorado State University, where I co-directed the Computer Vision Lab with my colleague and friend Ross Beveridge from 1996 to 2019. The lab initially concentrated on outdoor color image analysis and machine learning applied to computer vision (which was a rarity back then), but we branched out. We worked on computational models of human vision, during which time I had the joy of doing a sabbatical with Stephen Kosslyn in the Harvard Psychology Department. We worked closely with Jonathan Phillips at NIST and became a leader in the statistical evaluation of face recognition algorithms. With one of our students, David Bolme (now at ORNL), we revolutionized the field of visual object tracking. We worked with colleagues in high-performance computing, notably Wim Bohm and Walid Najjar, to map image processing algorithms onto FPGAs. And we applied early machine learning to video action recognition with Trevor Darrell at UC Berkeley and Chris Geyer at iRobot (see “Action Recognition: 2012” below).

From 2015 to 2019 we focused the lab on non-linguistic human-AI interaction, with funding under the DARPA Communicating with Computers (CwC) program. We combined visual emotion, gesture, and body pose recognition with an avatar to create two-way interactive non-verbal communication during joint human and AI exercises (see “Communicating with Computers: 2018” below). In 2019 I began my tour at DARPA, where I created the Perceptually-enabled Task Guidance (PTG) program, with the goal of using augmented reality to enhance human performance on real-world tasks through close, personalized human-AI teaming. I also ran programs on defenses against adversarial AI and on the high-dimensional geometry of machine learning.

My work on PTG drew me into human upskilling, which brings me to…

VIGIL: Upskilling Technicians for Rural Medicine: 2025

With support from the ARPA-H PARADIGM program, we (Nikhil Krishnaswamy, Sarath Sreedharan, Nathaniel Blanchard, and I) are part of the VIGIL project headed by Jason Corso at the University of Michigan. The goal of the project is to improve rural medical health outcomes by bringing medical care directly to patients’ homes. While other performers are building vans equipped to carry a wide array of medical technology, our focus is on solving medicine’s “thirst for labor”. Urban hospitals are staffed with specialized technicians: three types of ultrasound technicians (one each for legs, the heart, and obstetrics), X-ray technicians, CAT-scan technicians, phlebotomists (to draw blood), and many other specialists, each trained on a particular procedure or piece of equipment. We can get the equipment to the patient, but if no one knows how to use it, it won’t do any good, and we can’t fit a dozen specialists in the van even if we could hire them. Our goal is to create the technology to upskill a single generalist technician to perform whatever procedure a patient needs, when the patient needs it. To learn more, click on the YouTube link below!

https://youtu.be/HQehN62XEDg

Communicating with Computers: 2018

This video is from a live demonstration on June 11, 2018. It shows a person directing an avatar to build block structures (in this case, a staircase). The user (Rahul) can gesture and speak; the avatar (Diana) can gesture, speak, and move blocks. The user’s goal is to build the block structure; the avatar’s goal is to teach the user how to use the system more effectively. These two goals are related, but not the same. At the beginning of the run, Diana (the avatar) doesn’t know what Rahul (the user) knows, so she spends a lot of time asking him for confirmation and showing him alternate ways to accomplish goals. For example, she shows him that “yes” can be spoken or signaled with a “thumbs up” gesture. As the trial progresses, Rahul’s knowledge (and Diana’s knowledge of Rahul’s knowledge) improves, and the blocks move faster. He does have a little problem with the small purple block at the end, however.

The demonstration was joint work with James Pustejovsky and his lab at Brandeis (who created Diana and her reasoning system), and Jaime Ruiz and his lab at the University of Florida (who elicited the gesture set from naive users). CSU built the real-time gesture recognition system.

Action Recognition: 2012

Play the video below to see an example of earlier work in action recognition. This was joint work in the summer of 2012 with iRobot and UC Berkeley as part of the DARPA Mind’s Eye program. (Many thanks to Dr. Christopher Geyer and the folks at iRobot for producing this video.)