November 13 Petros Boufounos
Mitsubishi Electric Research Laboratories
Recent Advances in Signal Acquisition, Sensing, and Quantization
Abstract:
The increasing availability of computing power, thanks to the advances of Moore's law, has put significant pressure on sensing technology to follow suit. Although sensor hardware cannot always keep pace, recent theoretical developments such as Compressive Sensing and Computational Imaging have demonstrated how smart sensor design can exploit cheap computation to improve sensing and signal acquisition technology. The hallmarks of these theoretical developments are randomization, non-linear reconstruction, and emphasis on signal models.
In this talk we emphasize how a model of the acquisition system can be taken into account in the design of the reconstruction algorithms. Specifically, we examine how the randomization of the measurements interacts with measurement quantization in analog-to-digital conversion. We first consider finite-range quantizers and demonstrate that, counter to common intuition, we can often decrease the error due to quantization by increasing the saturation rate of the quantizer. Then we consider the extreme case of 1-bit quantization and we demonstrate that we can significantly improve performance by explicitly incorporating the appropriate quantization model in the reconstruction. Finally, we consider signals measured through non-linear distortions, and we demonstrate that we can still reconstruct the signal from the measurements, even if the distortion itself is not known.
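The trade-off behind the finite-range quantizer claim can be illustrated with a minimal sketch. It is not the speaker's reconstruction method: the function names, the 4-bit depth, and the use of Gaussian samples as a stand-in for randomized measurements are all assumptions made for illustration. It only shows that shrinking the quantizer range makes the step size finer while raising the fraction of saturated measurements.

```python
import random

def quantize(x, gain, bits):
    """Midrise uniform quantizer on [-gain, gain); values outside saturate."""
    levels = 2 ** bits
    step = 2.0 * gain / levels
    index = int((x + gain) // step)
    index = min(max(index, 0), levels - 1)  # clamp to the outermost levels
    saturated = x < -gain or x >= gain
    return -gain + (index + 0.5) * step, saturated

def saturation_rate(samples, gain, bits):
    """Fraction of samples that fall outside the quantizer range."""
    flags = [quantize(x, gain, bits)[1] for x in samples]
    return sum(flags) / len(flags)

# Hypothetical stand-in for randomized measurements: unit-variance Gaussians.
rng = random.Random(0)
measurements = [rng.gauss(0.0, 1.0) for _ in range(10000)]

for gain in (3.0, 2.0, 1.0):
    step = 2.0 * gain / 2 ** 4
    rate = saturation_rate(measurements, gain, 4)
    print(f"range +/-{gain}: step {step:.4f}, saturation rate {rate:.4f}")
```

A smaller range gives every unsaturated measurement a finer step, which is the resolution gain that the abstract weighs against the measurements lost to saturation.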
Bio:
Petros Boufounos completed his undergraduate and graduate studies at MIT. He received the S.B. degree in Economics in 2000, the S.B. and M.Eng. degrees in Electrical Engineering and Computer Science (EECS) in 2002, and the Sc.D. degree in EECS in 2006. He has been with Mitsubishi Electric Research Laboratories (MERL) in Cambridge, MA, since January 2009.
Between September 2006 and December 2008, Dr. Boufounos was with the Digital Signal Processing Group at Rice University doing research in the area of Compressive Sensing. Before that he was a postdoctoral associate in the MIT Digital Signal Processing Group. In addition to Compressive Sensing, his immediate research interests include signal processing, data representations, frame theory, and machine learning applied to signal processing. He is also looking into how compressed sensing interacts with other fields that use sensing extensively, such as robotics and mechatronics. Dr. Boufounos has received the Ernst A. Guillemin Master Thesis Award for his work on DNA sequencing and the Harold E. Hazen Award for Teaching Excellence, both from the MIT EECS department. He has also been an MIT Presidential Fellow. Dr. Boufounos is a member of the IEEE, Sigma Xi, Eta Kappa Nu, and Phi Beta Kappa.
October 30 C. Lee Giles
Pennsylvania State University
SeerSuite: Enterprise Search and Cyberinfrastructure for Science and Academia
Abstract:
Cyberinfrastructure, or e-science, has become crucial in many areas of science, as data access often defines scientific progress. Open source systems have greatly facilitated the design, implementation, and support of cyberinfrastructure. However, there exists no open source integrated system for building a search engine and digital library that covers all phases of information and knowledge extraction, such as citation extraction, automated indexing and ranking, chemical formulae search, and table indexing. We propose the open source SeerSuite architecture, a modular, extensible system built on successful open source projects such as Lucene/Solr, and discuss its uses in building enterprise search and cyberinfrastructure for the sciences and academia. We highlight application domains with examples from computer science (CiteSeerX), chemistry (ChemXSeer), and archaeology (ArchSeer).
CiteSeerX, the successor to CiteSeer, currently offers or intends to offer some unique aspects of search not yet present in other scientific search services or engines, such as table, figure, algorithm and author search. In addition, CiteSeerX continuously crawls the web and author submissions and now has nearly 1.5 million documents, close to 30 million citations, a million authors and comparable database tables. It has nearly 1 million unique users with several million hits a day.
In chemistry, the growth of data has been explosive, and timely and effective access to information and data is critical. The ChemXSeer system (funded by NSF Chemistry) is a portal and search engine for academic researchers in environmental chemistry, which integrates the scientific literature with experimental, analytical and simulation datasets. ChemXSeer consists of information crawled from the web, manually submitted scientific documents and user-submitted datasets, as well as scientific documents and metadata provided by major publishers. Information gathered from the web is publicly accessible, whereas access to restricted resources such as user-submitted data will be determined by those users. Thus, instead of being a fully open search engine and repository, ChemXSeer will be a hybrid one, limiting access to some resources.
Because such enterprise systems require unique information extraction approaches, several different machine learning methods, such as conditional random fields, support vector machines, mutual information based feature selection, and sequence mining, are critical for performance. We draw lessons for other e-science and cyberinfrastructure systems in terms of design, implementation and research, and discuss future directions and systems.
Bio:
Dr. C. Lee Giles is the David Reese Professor at the College of Information Sciences and Technology at the Pennsylvania State University, University Park, PA. He is also Professor of Computer Science and Engineering, Professor of Supply Chain and Information Systems, and Director of the Intelligent Systems Research Laboratory. He directs the CiteSeerX <http://citeseerx.ist.psu.edu> project and codirects the ChemXSeer <http://chemxseer.ist.psu.edu> project at Penn State. He has been associated with Columbia University, the University of Maryland, University of Pennsylvania, Princeton University, the University of Pisa and the University of Trento.
October 23 Alex Waibel
LTI
From Research Lab to Jungle Ops: Computer Speech Translation for Humanitarian Relief
Abstract:
Our world continues to be rocked by natural disasters and political calamities at unpredictable times and places, generating humanitarian needs that can often not be met locally alone. As the world responds, however, relief organizations and volunteers face communication challenges imposed by language barriers and cultural differences.
To satisfy the communication needs, we have worked for many years on building and deploying speech communication systems that would provide human-to-human language interpretation and have explored them in actual field use. In addition to the scientific problems associated with speech and language technology, such a goal harbors numerous practical, logistical and financial pitfalls that one stumbles upon when one leaves the comfort of our research labs.
In this talk I will tell the story of this (ongoing) adventure in three parts:
1.) Technology: How does a speech-to-speech translation system work and what does it do? What levels of performance can we expect, and what language assistance can it offer? Can it handle many languages and how flexible is it to be used in different field situations? What interfaces and what platforms are needed for field use? How can it be built, maintained and ported inexpensively?
2.) Money: How much does it cost to develop a speech translator? Who pays for the development of such systems? I will discuss our Robin Hood approach: we sign up people in the developing world to provide language expertise over the internet; we build, improve and market the technology in the developed world; with the proceeds we improve and adapt the technology for health care missions and redeploy it in the developing world. We have formed several companies around the world that provide system development, product distribution, marketing, and data collection.
3.) Deployment: We examine how it all fits together and what lessons we have learned from three different healthcare missions that we are attempting to support: in the mountains of Honduras, remote villages of Thailand, and in the jungles of Papua, Indonesia.
Warning: No equations, but plenty of system demos, pictures, movies and 'war' stories from the field.
Bio:
Alex Waibel is a Professor of Computer Science at the Language Technologies Institute at Carnegie Mellon University, and at the Institute of Anthropomatics at the University of Karlsruhe. He directs the international Center for Advanced Communication Technologies (InterACT) with research interests in multimodal and multilingual human communication systems. Dr. Waibel's team pioneered many of the first domain-limited and domain-unlimited speech translators. He was one of the founders and chairmen (1998-2000) of C-STAR, the consortium for speech translation research. He has published extensively in the field, received several patents and awards, and built several successful companies. He received his BS, MS and PhD degrees at MIT and CMU, respectively.
October 22 Tuomas Virtanen
Tampere University of Technology
New Machine Learning Algorithms Enable Recognizing Sounds in Mixture
Abstract:
Automatic recognition of sounds in the presence of interfering sounds has long been an unsolved problem, which has caused severe difficulties in applications such as automatic speech recognition (ASR). Additive noise significantly degrades the performance of conventional pattern recognition algorithms, for example the hidden Markov models used in ASR. New machine learning algorithms based on factorizing the sound spectrogram have proven to be good tools for separating sounds from mixtures. Specifically, non-negative spectrogram decompositions are able to learn the structure of the data automatically without the need to tune sophisticated separation algorithms. We present source separation algorithms and models based on non-negative matrix factorization and matrix deconvolution, and extensions such as the source-filter model. We discuss both supervised and unsupervised use of the methods. The ability of the models to deal with mixtures of sounds has provided superior results in the recognition of noisy signals. We present state-of-the-art results in robust speech recognition, speech enhancement, and music information retrieval. Audio demonstrations will be given.
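As a rough illustration of the non-negative spectrogram decomposition mentioned in the abstract, the following sketch factorizes a small non-negative matrix V into W times H using the standard Lee-Seung multiplicative updates for the Euclidean cost. This is a generic textbook variant, not necessarily the algorithm presented in the talk. The toy matrix, the function names, and the parameter values are assumptions made for illustration.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factorize non-negative V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H

def frob_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - W H."""
    R = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, R) for v, r in zip(rv, rr)) ** 0.5

# Toy "spectrogram": two spectral patterns (columns) with time-varying gains (rows).
V = matmul([[1, 0], [2, 0], [0, 1], [0, 3]], [[1, 2, 0, 1, 0], [0, 1, 3, 0, 2]])
W, H = nmf(V, rank=2)
print("residual:", frob_error(V, W, H))
```

In a separation setting, the columns of W play the role of spectral templates and the rows of H their activations over time. Because the updates only multiply by non-negative ratios, both factors stay non-negative throughout.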
Bio:
Tuomas Virtanen is a senior researcher at Tampere University of Technology, Finland. He completed his PhD on monaural sound source separation in 2006. His research interests include sound source separation, noise-robust automatic speech recognition, audio content analysis, and music information retrieval.
October 16 Chris Dyer
University of Maryland
Improving Machine Translation by Propagating Uncertainty
Abstract:
NLP systems typically consist of a series of components where the output of one module (e.g., a word segmenter) serves as input to another (e.g., a translator). To avoid the problem of compounding error propagation (cf. Finkel et al. 2006), downstream components need to consider more than a 1-best hypothesis from their upstream dependencies. I give examples of this problem and its solution in machine translation, where sources of upstream uncertainty include not only the inherently noisy outputs of statistical preprocessors (such as word segmenters and STT systems), but also "development-time" decisions (such as determining what the appropriate granularity of the lexical units is or how much text normalization to do). I present results showing that translation quality can be improved over state-of-the-art baselines by considering a distribution over possible inputs and that this can be done at little runtime cost if the input is properly structured.
I then explore the challenges and opportunities presented by learning in a world where, rather than having an unambiguous gold labeling for each training instance, a distribution or set of possible labels is given. To account for this ambiguity, I use a conditional random field model with latent variables (cf. Petrov and Klein 2007) and describe its application to the problem of word segmentation for machine translation. In this task, there may be multiple plausible gold segmentations for words, and I incorporate all of them during training. I conclude the talk by discussing ongoing work where this approach to training is applied to translation modeling itself. Since efficient inference in these models is notoriously challenging, I also describe an apparently novel synchronous parsing algorithm that has better asymptotic run-time than previously described ones (e.g., Wu 1997).
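The 1-best problem raised in the abstract can be seen in a toy sketch. The probabilities, hypothesis names, and function names below are invented for illustration and do not come from the talk. The first function commits to the single best upstream hypothesis before translating. The second considers every upstream hypothesis and picks the output with the best joint score.

```python
def one_best_pipeline(upstream, downstream):
    """Commit to the top upstream hypothesis, then pick its best output."""
    best_input = max(upstream, key=upstream.get)
    outputs = downstream[best_input]
    best_output = max(outputs, key=outputs.get)
    return best_output, upstream[best_input] * outputs[best_output]

def joint_pipeline(upstream, downstream):
    """Score every (input, output) pair and keep the best joint score per output."""
    candidates = {}
    for inp, p_in in upstream.items():
        for out, p_out in downstream[inp].items():
            candidates[out] = max(candidates.get(out, 0.0), p_in * p_out)
    best_output = max(candidates, key=candidates.get)
    return best_output, candidates[best_output]

# Hypothetical segmenter posteriors and per-segmentation translation scores.
upstream = {"seg_a": 0.55, "seg_b": 0.45}
downstream = {
    "seg_a": {"translation_1": 0.4, "translation_2": 0.6},
    "seg_b": {"translation_3": 0.9, "translation_1": 0.1},
}

print(one_best_pipeline(upstream, downstream))
print(joint_pipeline(upstream, downstream))
```

With these made-up numbers the 1-best pipeline is locked into the slightly more probable segmentation and misses the output with the higher joint score, which is the kind of compounding error that passing a distribution downstream is meant to avoid.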
Bio:
Chris Dyer is a Ph.D. candidate at the University of Maryland, College Park, in the Department of Linguistics under the supervision of Philip Resnik. His research interests include statistical machine translation, computational morphology and phonology, unsupervised learning, and scaling NLP models to deal with larger data sets using the MapReduce programming paradigm. He is graduating this spring and will be joining the LTI as a postdoc in Noah Smith's lab.
October 9 Jianchang Mao
Yahoo Labs
Learn-to-Rank-Ads in Computational Advertising
Abstract:
Hundreds of millions of internet users have been enjoying a plethora of free web services, ranging from search, email, news, sports, finance, and video, to various social network services. Most of these free services are fueled by online advertising, a multi-billion dollar industry. Yet, online advertising spending is still about 10% of the global advertising market of nearly half a trillion dollars. As more users spend more time online, advertisers are moving more budgets to online advertising. The rapid growth of online advertising has created enormous opportunities as well as technical challenges that demand computational intelligence. Computational Advertising has emerged as a new interdisciplinary field for solving challenging problems that arise in online advertising. The central problem of computational advertising is to find the best matching ads from a large ad inventory for a user in a given context (e.g., query, page view) under certain business constraints (blocking, targeting, guaranteed delivery, etc.).
In the first part of this talk, I will provide a brief introduction to various forms of online advertising, including search advertising, contextual advertising, guaranteed and non-guaranteed display advertising. For each form of online advertising, I will describe the problem formulation and its accompanying computational challenges.
In the second part of this talk, I will provide a case study on Learn-to-Rank-Ads to illustrate how machine learning techniques can be employed to attack the central problem in computational advertising. Learning to rank has attracted the attention of many machine learning researchers in the last decade. A number of Learn-to-Rank algorithms have been proposed in the literature. Until recently, most learn-to-rank algorithms did not use a loss function related to popular relevance measures, such as NDCG (Normalized Discounted Cumulative Gain) and MAP (Mean Average Precision). The main difficulty in direct optimization of these measures is that they depend on the ranks of objects, not the numerical values output by a ranking function. We propose a fully Bayesian framework that addresses this challenge by optimizing the expectation of the NDCG measure over all the possible permutations of objects. A relaxation strategy is used to approximate the expectation of NDCG over the space of permutations, and a bound optimization approach is employed to make the computation efficient. Extensive experiments show that the proposed algorithm outperforms state-of-the-art Learn-to-Rank algorithms on several benchmark data sets.
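The difficulty noted in the abstract, that NDCG depends on ranks rather than on score values, can be seen in a short sketch. It uses one common definition of NDCG (gain 2^rel - 1 with a log2 position discount). The abstract does not say which variant the talk uses, so that choice, the function names, and the example scores are assumptions.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of relevance labels listed in ranked order."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))

def ndcg(scores, relevances):
    """NDCG of the ranking induced by sorting items by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [relevances[i] for i in order]
    ideal_dcg = dcg(sorted(relevances, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

relevances = [2, 1, 0]
# Very different score values, same ordering: NDCG is identical.
print(ndcg([0.9, 0.8, 0.1], relevances))
print(ndcg([0.51, 0.50, 0.49], relevances))
# Reversed ordering: NDCG drops.
print(ndcg([0.1, 0.2, 0.3], relevances))
```

Because the measure changes only when two items swap positions, it is piecewise constant in the scores, which is why gradient-based optimization cannot be applied to it directly and why an expectation over permutations is a natural surrogate.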
Bio:
Dr. Jianchang (JC) Mao is currently the head of Contextual & Display Advertising Sciences in Y! Labs, responsible for the R&D of Contextual Advertising, Display Advertising, Targeting, and Categorization technologies and products. He was also a Science/Engineering director responsible for development of Sponsored Search Matching technologies and backend technologies for several Yahoo! Social Search products, including Y! Answers and Y! MyWeb (Social Bookmarks). Prior to joining Yahoo!, Dr. Mao was Director of Emerging Technologies & Principal Architect at Verity Inc., a leader in Enterprise Search (acquired by Autonomy), from 2000 to 2004. Prior to this, Dr. Mao was a research staff member at the IBM Almaden Research Center from 1994 to 2000. Dr. Mao's research interests include Machine Learning, Data Mining, Information Retrieval, Computational Advertising, Social Networks, Pattern Recognition and Image Processing. He received an Honorable Mention Award in the ACM KDD Cup 2002, the IEEE Transactions on Neural Networks Outstanding Paper Award in 1996, and an Honorable Mention Award from the International Pattern Recognition Society in 1993. Dr. Mao served as an associate editor of the IEEE Transactions on Neural Networks, 1999-2000. He received his Ph.D. degree in Computer Science from Michigan State University in 1994.