My research interests revolve around the integration of text, image, video, and audio analysis. In the Informedia Project we have built the News-on-Demand application, which is an instantiation of the Informedia Digital Video Library idea, based completely on automatic methods for processing television and radio news. Through the combination of the strengths of speech recognition, natural language processing, information retrieval and interface design, the system is able to overcome some of the shortfalls inherent in each of the component technologies.
My goal is to utilize large corpora of "found data", that is data that is already available through the Internet or other readily accessible open sources, to improve speech and natural language processing by exploiting advantages across different modalities. It has become clear in recent years that large volumes of text, image, video, and audio can be easily stored and made available for research and applications. However, most of these text, image, video, and audio sources were not produced with computer processing in mind. My intention is to design and build intelligent, understanding programs that help process data from these sources and make the data useful for other applications. This data can be used to improve speech recognition, image understanding, natural language processing, machine learning as well as information retrieval. The challenge is to find the right data, process it into suitable form for training, learning or re-use and build mechanisms that can successfully utilize this data.
Speech and multimedia technology is about to make a major impact on our daily interaction with computers. What is needed at this point are clear demonstrations of the advantages of integrated speech and multimedia interfaces.