CMU eyes `Star Trek' technology

By Steve Segal
TRIBUNE-REVIEW

In the futuristic, science fiction world of Star Trek, different life forms can communicate because of a universal language translator. "It is operated by sensing and comparing brain-wave frequencies, then selecting comparable concepts to use as a basis for translation," the Star Trek Omnipedia says.

Researchers at Carnegie Mellon University's Center for Machine Translation are laying the groundwork for their own version of the language translator. "Star Trek's universal translator is the ultimate goal of what we want here, but we have to do it piece by piece," said Jaime Carbonell, Center for Machine Translation director. The center has been developing computer software to bring down the world's language barriers since 1986. It started with a $250,000 annual budget, which has since grown to $2.5 million.

As the world continues to shrink, the need for automated machine translation is growing rapidly. Translations are needed for international trade, medicine, tourism and education. The world market for translation goods, products and services is estimated at as much as $30 billion annually, Carbonell said. The Japanese and Europeans have recognized the value of machine translation for decades and were among the industry's founders.

There are two types of uses for language translation. One is to translate specific information into other languages, known as information dissemination. The other is to get the general meaning of text in an unknown language, known as information assimilation via translation, Carbonell said.

Information dissemination means users know exactly what information needs to be translated. An example would be if Company X had specific operating instructions for its widget. If the company also sold its widget in Japan, the operating instructions would need to be translated into Japanese. All of the words that need to be translated are known ahead of time. The translator would need to understand the subtleties of both the English and Japanese languages. Wrong or confusing instructions could lead to a disaster. Since all of the words are known ahead of time, this is also known as a "restricted semantics base." It is the easiest type of translation to do, but requires extreme accuracy.

This type of translation is easier for humans than machines, Carbonell said. A well-trained translator can work out the syntax of sentences that have more than one possible reading. If a person said, "lift a car with a hoist," the hoist would be doing the lifting. But if a person said, "lift a car with a license plate," common sense would tell a listener the license plate is not doing the lifting. A machine, however, would have a lot more difficulty understanding the difference.

It's quite a different story with information assimilation via translation. This type of translation is not for publication or mass audiences, but for a handful of people. Accuracy is not as important as timeliness. It's sort of a quick scan. An example of this use would be if Company X wanted to find out what its competitors in Japan were doing. A salesman with 50 pages of a competitor's documents may be interested in just a few key paragraphs, but unsure which paragraphs. He could feed the documents to a machine that does information assimilation via translation and get almost instant results.

"It would be sloppier, but faster than a human," Carbonell said. The salesman could then give the relevant paragraphs to a human for a more accurate translation. This method keeps a human translator from wasting precious time on irrelevant information. "Humans still win in terms of quality," Carbonell said.

The center's goal is to build artificially intelligent systems capable of translating a variety of languages, with a special emphasis on English, Japanese, French, Spanish and German. Its researchers have already produced several translation software packages, such as Pangloss, Shogun, Janus and Kant. Among the projects they are working on is a rapid deployment project for the U.S. Department of Defense: a Serbo-Croatian-to-English rapid translation system.

Unlike the systems they have been working on for years, these systems must be working in months, Carbonell said. Aside from the technical problems that must be solved, the center always needs more resources. They've found a few native speakers of Serbo-Croatian and reference material, but are looking for more. "Before the fall of the Soviet Union, we knew who the `bad guys' were," said Robert Frederking, a systems scientist at the center.

Today the government does not always know ahead of time what language translations will be needed, Frederking said. The Haiti and Somalia actions are two examples of this need for quick language translations. "We think we know how to bring up a rapid translation system," he said. "It is crude, but gets better every day. Information can be added to the system every day, but the interface stays the same."

The center is also working with other Carnegie Mellon departments on a belt-attached, wearable computer to run the translation software. The center has a home page on the World Wide Web at http://www.mt.cs.cmu.edu/cmt/CMT-home.html.
