
About the Language Technologies Institute

The Language Technologies Institute (LTI) at Carnegie Mellon University (CMU) conducts extensive research on Computational Linguistics, Machine Translation, Speech Recognition and Synthesis, Information Retrieval, Computational Biology, Machine Learning, Text Mining, Data Mining, Knowledge Representation, and Intelligent Language Tutoring.

Our “Bill of Rights” is:

 


Upcoming Events

Yanjun Qi
LTI Ph.D. Thesis Defense
Learning of Protein Interaction Networks

Wednesday, May 14, 2008
11:00am
Newell-Simon Hall 3002


Jae Dong Kim
LTI Ph.D. Thesis Proposal
Chunk Alignment for Corpus-Based Machine Translation

Thursday, May 15, 2008
9:00am
Newell-Simon Hall 3002

 

 

Shekar Sivasubramanian
(Self-Defined LTI Ph.D. Student)
Knowledge Management for Software Reuse in Global Development

Thursday, May 15, 2008
3:00pm
Wean Hall 4623


Abstract: The late 1990s saw the emergence of global software development centers (GDCs) located in different parts of the world to serve the software development needs of companies in the United States and Europe. A GDC forms a large-scale economic model for the remote development of software, driven by the cost benefits offered by the workforce in these locations. As the nature of work in these locations has changed from maintenance and migration to core development, GDCs can provide an aggregation of technical talent that offers a significant opportunity for software reuse, thus improving the productivity and quality of delivered software solutions (Basili: 1994). Since effective reuse involves both socio-economic and technical challenges (Griss: 1991), software reuse may require a change in software development practices within the GDC.

Software reuse in a GDC requires the availability of appropriate knowledge at the correct time across project teams, geographies, domains, technologies, and time. The application of knowledge-related practices requires the software development process to be aware of the knowledge resources that are consumed and enriched during software development activities. This talk proposes the use of a formally defined knowledge management framework that includes key knowledge management practices and an associated tool that can be integrated into the software development practices in a GDC environment. The tool is supported by a formal specification and an ontology that will be encoded for use. The talk concludes with the definition of an experiment to be carried out using the proposed tools and practices to validate a set of hypotheses associated with the utility of knowledge management practices for software development.


IR Discussion Series

Hui (Grace) Yang
Ontology Learning by Supervised Hierarchical Clustering

Friday, May 16, 2008
Newell-Simon Hall 3002
12:00pm


LTI Seminar Series


 

2008 Commencement!

Doctorate Hooding Ceremony
SCS Ceremony & Reception



Research Highlights


Language and Politics

William Cohen, Noah Smith, and Tae Yano

Most approaches to automatic text analysis and processing treat text as a stream of words or sentences. A typical underlying assumption is that the use of language in the data is literal and that the data represent facts. Many genres, however, do not have these features.

We are exploring automatic methods for analyzing text in the political domain, specifically blog posts on topics pertinent to the 2008 United States Presidential Elections. Political text is often indirect, sarcastic, repetitive, hyperbolic, emotional, biased, manipulative, and riddled with unstated assumptions. Our aim is to automatically separate useful, thoughtful information from redundant "spin," using statistical natural language processing techniques and a data-driven methodology that makes use of the insights of political scientists.

The broader impact of this work will consist of a renewed emphasis on exploiting domain knowledge together with text data for more powerful natural language understanding technology, as well as software tools that will promote more informed decision-making among American voters.

 




 


In the News


CMU and LTI First To Use Yahoo!'s New Supercomputing Center

Yahoo! Inc. is assisting research at the LTI by providing access to a 4,000-processor supercomputer running open-source distributed computing software such as Hadoop and the Pig parallel programming language. The initial group of researchers using the system includes Jamie Callan (information retrieval), Noah Smith (natural language processing), and Stephan Vogel (machine translation). "We are excited about collaborating with Yahoo! on systems software research, helping to advance the state-of-the-art, and creating new research possibilities in this critical area," said Randall E. Bryant, dean of the School of Computer Science at Carnegie Mellon. For more information, see the Yahoo! press release.


 

Social Networking Project Emphasizes Compatible Minds

Incoming CMU freshmen will have the chance to try a new social networking site called Mindkin, developed by four SCS graduate students: Ulas Bardak, Betty Cheng, and Vasco Pedro of the LTI, and Jahanzeb Sherwani of the Computer Science Department. Bardak says he and the other students began working on Mindkin two years ago because existing sites seemed superficial, particularly in the emphasis given to photos.

Mindkin’s central feature is “Thought Stream,” a screen on which ideas submitted by users scroll by. A system of credits forces users to be selective in identifying ideas they like or dislike, which makes it impossible for someone to simply “like” all of the ideas scrolling through Thought Stream. If a user likes enough ideas from the same author, that author’s identity is eventually revealed so direct contact can be made.

The Mindkin braintrust has received a provisional patent on the concept and is looking for ways to commercialize it.

The Olympus Project has adopted it as a PROBE and will feature the social networking site at its next “Show and Tell” for venture capitalists on Sept. 25 in the Collaborative Innovation Center.
 

News contributed by Byron Spice

 

 





LTI is part of the School of Computer Science at Carnegie Mellon University.
This page is maintained by The LTI Webmaster