LTI's Alan Black Part of Team Making Software Available for Free
Millions of visually impaired people in India may benefit from free, open-source software for Android devices that converts electronic text written in Indian languages into messages they can hear.
The text-to-speech (TTS) software, developed by Carnegie Mellon University in collaboration with the Hear2Read project, can now be downloaded free of charge from Google Play. Tamil is the first language offered, with subsequent releases of seven major languages — Hindi, Bengali, Gujarati, Marathi, Kannada, Punjabi and Telugu — expected over the remainder of the year.
Four out of five people in India speak one of those eight languages. India has 22 official languages in all. More than 62 million Indians are visually impaired.
"We're looking to create speech output for as many languages as possible," said Suresh Bazaj, a serial entrepreneur in the San Francisco Bay Area and founder of Hear2Read.
TTS software is commonplace in the United States and many parts of the world, but Bazaj said good-quality TTS for Indian languages is difficult to find, difficult to use or unaffordable. Yet the need is great: only 10 percent of blind children in India get any education, and 90 percent of visually impaired Indians live in poverty.
"Making it available as free, open-source software thus was a key goal," said Alan Black, a professor in the School of Computer Science's Language Technologies Institute (LTI). "People should be able to download this and it should just work. We put a lot of effort into making this accessible and easy to use."
Bazaj met Black, a scientist internationally known for his work in speech synthesis, two years ago through Alok Parlikar, a former student of Black's, and recruited him to the project. While the LTI had a wealth of knowledge and tools for creating TTS software, the Hear2Read project inspired Black and his students to develop a system for building such software repeatedly and efficiently, and for making it user-friendly.
"Each language is different, and historically TTS systems have been done one at a time," Bazaj said. "We looked at commonalities of Indian languages and developed tools to apply the same technology to multiple languages."
The system developed by Black's research team enables creation of a baseline TTS system after recording 2–3 hours of clear, consistent speech from a native speaker. The open-source text read by the speaker comes from various sources such as Wikisource, books and periodicals.
Though the machine learning process used to create voice databases requires large-scale computing, the resulting database for each language is relatively small and can run on low-end Android phones or tablets that retail for less than $100 (7,000 Indian rupees). That cost threshold is within guidelines established by the Government of India's Assistance for Disabled Persons program, which helps people with disabilities purchase assistive devices based on income.
The conversion from text to speech is done in real time without internet access, as most people in India either do not have continuous internet access or cannot afford it.
The Hear2Read app works with the Android TalkBack accessibility option that allows people with low vision to use applications such as web browsers, email, SMS (texting), phone calls, word processors, spreadsheets and book readers.
For Bazaj, this project has personal meaning. He has had retinal detachments in both eyes that were successfully repaired. He was fortunate to have access to excellent medical care, which is not the case for most people in India. He believes the ability to read is directly related to a good quality of life, and so his mission began.
"Like any startup, I jumped into it not knowing how deep the pool was," Bazaj recalled. After meeting Black, he began supporting a CMU student to develop TTS for Indian languages. In addition, he has recruited more than 50 volunteers, based in the United States and India, who are native speakers of Indian languages.
"This project couldn't have been accomplished without the dedication and support provided by our selfless volunteers," Bazaj said. The San Francisco Bay Area non-profits Access Braille and Indians for Collective Action have provided funding to support the project.
Carnegie Mellon University is a private, internationally ranked research university with programs in areas ranging from science, technology and business, to public policy, the humanities and the arts. More than 13,000 students in the university's seven schools and colleges benefit from a small student-to-faculty ratio and an education characterized by its focus on creating and implementing solutions for real problems, interdisciplinary collaboration and innovation.
Hear2Read is a volunteer organization dedicated to bridging the digital divide for blind and low-vision Indic language populations. Our mission is to open doors to all education, employment and business opportunities for the visually challenged.