Language Technologies Institute
Carnegie Mellon University
School of Computer Science

Ph.D. Program in Language and Information Technologies

Please see the LTI Policies and Procedures for the most thorough, up-to-date set of rules concerning this program.

Introduction

The School of Computer Science at Carnegie Mellon University is pleased to announce the creation of a Ph.D. program in Language and Information Technologies. The new program is designed to build on CMU's strengths in computational linguistics, machine translation, information management, and speech understanding by offering a curriculum focused on these particular areas. These fields have shown considerable growth in recent years and are poised for further breakthroughs that take advantage of emerging technological infrastructures such as the World-Wide Web, mobile computing, and multimedia interfaces. Please note that:

  • Computational Linguistics includes parsing, generation, representation, machine translation, and key-fact mining of text.
  • Speech Understanding includes speaker-independent recognition, large-vocabulary dictation, task-oriented interfaces, and speech-to-speech MT.
  • Information Management includes text retrieval, indexing, summarization, categorization, and database infrastructure.
  • Multimedia Systems combines many of the above technologies and extends them beyond text and speech into animation, video, and virtual reality.

The Language and Information Technologies Ph.D. is broader in scope than the Computational Linguistics (CL) Ph.D. previously offered in the School of Humanities and Social Sciences, but it incorporates a considerable number of the course topics covered in that program; tenured CL faculty will actively participate in the new program.

Program Requirements

The Ph.D. in Language and Information Technologies consists of the following components:

  • Successful completion of a set of mandatory courses
  • Mastery of certain proficiencies
  • A program of research culminating in a Ph.D. thesis.

Course Requirements

The LTI curriculum was revised in Spring 2001 to eliminate the "core course" concept. Please see the LTI Policies and Procedures for more information. The course requirement consists of eight (8) courses from the list of LTI focus areas. Students should select specific courses in consultation with their advisor, keeping in mind that not all courses are offered each year. (Note: each 6-unit lab course counts as one half of a course towards the total eight required.)

Upon completion of the eight required courses, students may choose to take additional courses as electives. Students may select these courses from the LTI list, or from those offered in the Computer Science Department or other CMU or Pitt departments. Students interested in speech should consider speech-oriented electives; other areas of interest include Linguistics, Statistics, and Human-Computer Interaction (HCI).

Across the eight required courses and any electives, students must select at least one course from each focus area.

All students must also enroll (for a minimum of two sections) in the Language Engineering laboratory, which includes hands-on work in four different laboratory modules (Speech, Machine Translation, Information Retrieval, and Natural Language Analysis). The lab modules are self-paced, with TA and faculty guidance. (As mentioned above, each lab counts as one half of a course towards the total eight required.)

Model Curriculum

The following gives a possible Ph.D. curriculum for a student specializing in Machine Translation. Specializations in Speech, Information Retrieval, and Multimedia Systems will be similar in structure, with appropriate course substitutions.

Example Curriculum

Year 1
  Semester 1: Linguistic Basis of NLP; Algorithms for NLP; Self-paced Lab; Research
  Semester 2: Machine Translation; Artificial Intelligence; Self-paced Lab; Research

Year 2
  Semester 1: Software Engineering for LT (I); Statistics for NLP; Research
  Semester 2: Software Engineering for LT (II); Principles of Translation; Research

Year 3
  Semester 1: Teaching (TA); Research
  Semester 2: Thesis Proposal; Research

Year 4
  Semester 1: Elective or Seminar; Research
  Semester 2: Elective or Seminar; Research

Year 5
  Semester 1: Research
  Semester 2: Thesis Defense

Proficiencies

The following skills must be demonstrated in the course of graduate study, with flexibility in the form and timing of their demonstration:

  • Writing: Satisfied via a conference paper or article that has passed peer review, or via a longer internal paper or report reviewed by several faculty. The topic of the paper may be the student's research results, a comprehensive survey of a research area, a linguistic analysis paper, or any other pertinent topic.
  • Presentation: Satisfied via a public presentation of reasonable quality, such as an external conference presentation or an internal seminar presentation reviewed by several faculty.
  • Programming: Normally the programming requirement will be satisfied in the course of a student's research and/or project work, but it could also be satisfied via explicit apprenticeship if desired.
  • Teaching: Satisfied by assisting in the teaching of a class (i.e., being a TA for a semester), including the planning of a portion of the syllabus and exercises, as well as delivery of some lectures under faculty supervision.

Research and Ph.D. Thesis

It is expected that all Ph.D. students engage in active research from their first semester. Moreover, advisor selection should occur within 1-2 months of entering the Ph.D. program, with the option to change at a later time. Roughly half of a student's time should be allocated to research and lab work, and half to courses until these are completed.

The dissertation proposal, normally presented at the end of the third year, should be a document specifying:

  • The general area of investigation, and the specific problem(s) addressed.
  • A clear argument for the significance of this problem, and the expected scientific contributions in the proposed work towards its solution.
  • Relevant past and on-going research, including competing approaches.
  • Description of work to date to establish a measure of credibility with respect to the proposed research, including any preliminary results.
  • Description of work remaining to be done, including theoretical framework, and/or system building and/or experimentation and evaluation metrics.
  • A projected timeline for completion.

A dissertation committee consisting of the advisor, at least two other CMU faculty in language technologies, and at least one external member should be approved prior to the proposal. Note: University rules require that the time and place of the proposal presentation be publicly announced at least one week prior to the presentation. This should be coordinated with the Chair of the Graduate Programs.

The dissertation itself, normally completed during the fifth year, should include a detailed description of all the work done, including a clear evaluation and discussion of final scientific contributions. There are no fixed requirements or guidelines for style or document length. The dissertation defense is a public presentation and defense of the dissertation results. Note: University rules again require that the time and place of the dissertation defense be publicly announced at least one week prior to the defense. This should also be coordinated with the Chair of the Graduate Programs.

Financial Support

Although all Ph.D. students will receive some form of financial support, the exact form of that support may vary. Possible forms of support include external fellowships, research assistantships (RAs), and teaching assistantships (TAs). RAs require a measure of project work, and TAs require teaching each semester.

Language Technologies Institute • 5000 Forbes Ave • Pittsburgh, PA 15213-3891 • (412) 268-6591