A patient walks into a doctor's office, complaining of a sore throat. After an initial exam, the doctor orders a culture and a few blood tests that she uses — along with the exam results and professional judgment — to make a fairly objective diagnosis.
But what about patients struggling with mental illnesses? There's no blood test for depression or anxiety, and diagnosis is often based on more subjective criteria.
Language Technologies Assistant Professor L.-P. Morency wants to change that.
A computer enthusiast from as early as age 12, Morency has always appreciated not just the act of programming, but the logical way of thinking it requires. As he worked toward his undergraduate degree, that interest in logical thinking led to research on computer systems that could recognize objects or people. Combined with a natural curiosity about psychology and how humans interact with one another, it shaped a lofty career goal: building algorithms and programs that allow computers to understand how people communicate, including the subtle communicative behaviors that unfold during social interactions.
"I want to understand how people express themselves through language and gestures, and I want to understand them together," he said. "There's a lot of great work in analyzing speech and gesture, but there's been little work that puts them together."
Morency's research into multimodal algorithms that analyze speech and gesture has taken the form of SimSensei, a virtual interviewer designed to help clinicians diagnose psychological distress. Outfitted with a Microsoft Kinect augmented with facial recognition technology and depth-sensing cameras, SimSensei "interviews" patients while analyzing their smiles, gaze and fidgeting behavior. That data is then provided to doctors, who can use it as a more objective way to identify depression and anxiety.
MultiSense, the suite of sensing and analysis technologies Morency created during the SimSensei project, spans a multidisciplinary range of topics, including multimodal interaction, social psychology, computer vision, machine learning and artificial intelligence. That natural inclination toward multidisciplinarity brought him to the LTI.
"The LTI is the perfect place for me to advance my goal of building multimodal algorithms that bring together speech and gestures — like facial expressions, posture and reciprocity — because of its established expertise in speech and language," he said.
He was also drawn to CMU because of how tightly multidisciplinary collaboration is woven into the culture.
"When looking at other wonderful universities, CMU stood out because multidisciplinary research is its anchor," he said. "It's at the foundation of the university."
Morency joined the LTI this semester from the University of Southern California, where he directed the MultiComp Lab. He earned his master's and doctoral degrees in computer science from the Massachusetts Institute of Technology in 2002 and 2006, respectively, and a bachelor's degree in computer science from Laval University in 2000. He has received seven best-paper awards as well as the NetExplo 2014 Award, presented in partnership with UNESCO, which recognized his work as one of the year's ten most promising digital initiatives.