CDAC GIST envisions "A World Where Information Knows No Language Barriers"
1.2 billion people, 29 states, 7 Union Territories, 22 official languages, thousands of dialects - One Nation - India. With the launch of the Government's ambitious "Digital India" initiative, India is poised to become a world leader in information technology. Support for Indian languages is the cornerstone of success for such a massive programme.
Under the National e-Governance Plan (NeGP), various Mission Mode Projects have been initiated to provide citizen-centric services. Over 12,000 Central / State Government websites cater to the needs of the common man, from e-courts, road transport and healthcare to education, the public distribution system and many more. The reach of these services will grow if they are offered in local languages. If the fruits of the digital revolution are to trickle down to every citizen, language translation is a prime necessity.
Translating these services and sites into all 22 official Indian languages is a herculean task. The challenges include voluminous data, efficiency, cost, quality, skilled manpower and the simultaneous release of information in all the Indian languages.
With this requirement in mind, CDAC GIST has initiated the development of the "Localisation Projects Management Framework" (LPMF) on a cloud platform under the aegis of DeitY, Government of India.
Citizens can now use these services in their own language at the click of a button through a browser-based plug-in. The plug-in is freely available for download from http://localization.gov.in
The "GIST Online Translation Framework" at the back end does the work of translating the website into the language of the user's choice. The framework is backed by the requisite Natural Language Processing (NLP) tools and technologies and is based on the reuse of Translation Memories, Term Banks and other linguistic resources, including Machine Translation systems. It is a first-of-its-kind framework that brings scientific rigour to the challenge of in-context translation.
Currently, the framework leverages Translation Memories and Term Banks generated from more than 30 Mission Mode Project sites in six different languages. In addition, it supports basic translation systems for cognate languages, viz. Hindi-Urdu and Urdu-Hindi.
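To illustrate what reusing a Translation Memory and a Term Bank can look like in practice, here is a minimal Python sketch. It is not the framework's actual implementation: the data structures, the function names `lookup` and `term_hints`, the similarity threshold and the sample English-Hindi entries are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical sample resources; the framework's real data model is not described here.
TRANSLATION_MEMORY = {
    "Apply for a driving licence": "ड्राइविंग लाइसेंस के लिए आवेदन करें",
}
TERM_BANK = {
    "licence": "लाइसेंस",
}


def normalize(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def lookup(segment, threshold=0.85):
    """Return (translation, match_type); match_type is 'exact', 'fuzzy' or None."""
    key = normalize(segment)
    memory = {normalize(src): tgt for src, tgt in TRANSLATION_MEMORY.items()}

    # 1. Exact reuse of a previously translated segment.
    if key in memory:
        return memory[key], "exact"

    # 2. Fuzzy reuse: offer the closest stored segment as a suggestion.
    best_score, best_target = 0.0, None
    for src, tgt in memory.items():
        score = SequenceMatcher(None, key, src).ratio()
        if score > best_score:
            best_score, best_target = score, tgt
    if best_score >= threshold:
        return best_target, "fuzzy"

    # 3. Nothing reusable; a translator or an MT system would take over here.
    return None, None


def term_hints(segment):
    """Return Term Bank entries for the words that occur in the segment."""
    words = normalize(segment).split()
    return {w: TERM_BANK[w] for w in words if w in TERM_BANK}
```

In this sketch an exact match can be applied directly, a fuzzy match is only a suggestion for a translator to post-edit, and the Term Bank hints help keep terminology consistent when no stored segment is close enough.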
Such a mammoth endeavour can only be handled through a crowdsourced approach. Any citizen can voluntarily contribute or improve translations, become a part of this mission and help us dissolve the language barriers. The process is simple, and the plug-in provides suggestions to speed up translation.
- The framework is language independent and easy to use.
- No changes to the source code of the website(s) are needed, which makes it easier and hassle-free for website owners to offer their content in local languages.
- The GO-Translate framework can translate website(s) dynamically, on the fly, at the click of a button.
- It enables the crowd and translators to contribute and update translations. Various MT systems are also integrated to assist translators with translation and post-editing.
- A virtual keyboard for all Indian scripts allows the crowd and the translators to edit or contribute a new translation.
- Each translation submitted by a translator is stored in the big-store on a centralized remote server, making Indian language content reusable and available for further NLP processing.
- Once the translations have been successfully validated, all previously contributed translations can be applied to the web pages whenever the same URL is accessed again.
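The contribute, validate and reuse cycle described in the last two points can be sketched as follows. This is a hypothetical in-memory model in Python, not the framework's server-side design: the class names `Contribution` and `TranslationStore`, the method names and the example URL are assumptions chosen only to show the flow.

```python
from dataclasses import dataclass, field


@dataclass
class Contribution:
    source: str              # original text segment on the page
    target: str              # translation contributed for that segment
    validated: bool = False  # set to True once a reviewer approves it


@dataclass
class TranslationStore:
    # Maps (url, language code) to the contributions made for that page.
    pages: dict = field(default_factory=dict)

    def submit(self, url, lang, source, target):
        """Record a contribution; it stays unvalidated until reviewed."""
        contribution = Contribution(source, target)
        self.pages.setdefault((url, lang), []).append(contribution)
        return contribution

    def validate(self, contribution):
        """Mark a contribution as approved for reuse."""
        contribution.validated = True

    def apply(self, url, lang, segments):
        """Translate a page's segments using validated contributions only."""
        approved = {
            c.source: c.target
            for c in self.pages.get((url, lang), [])
            if c.validated
        }
        # Segments without an approved translation are left unchanged.
        return [approved.get(segment, segment) for segment in segments]


store = TranslationStore()
url = "http://example.gov.in/"  # placeholder URL for illustration
entry = store.submit(url, "hi", "Home", "मुख्य पृष्ठ")
before = store.apply(url, "hi", ["Home", "Contact"])  # nothing validated yet
store.validate(entry)
after = store.apply(url, "hi", ["Home", "Contact"])   # "Home" is now translated
```

Keying the store by URL and language means a later visit to the same page can reuse everything already approved for it, while unvalidated contributions never reach the reader.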
Mahesh D. Kulkarni
Associate Director & HoD.