Multilingual and Heritage Computing
Mission:
Dissolving Language Barriers to place the power of computing and e-content in the hands of the people of India
India has 22 official languages, and the use of computers is spreading fast, not only to create employment in the IT sector but also to support productive use of IT in daily life: to increase productivity and competitiveness, provide a better quality of life, enable inclusiveness and strengthen democracy. For different sections of people to be able to use computers (and, increasingly, text and data over mobile phones), the Basic Information Processing Kit for Indian languages must be constantly upgraded for various hardware and software platforms, new tools must be added, and work must be promoted with developers, ISVs, system integrators and application developers to enable and support the use of Indian languages in different sectors and verticals. Increasingly, Indian language content in digital form also has to be created and supported so that applications can reach a critical mass.
Also, a whole range of new and emerging technology tools and capabilities, from machine-assisted translation and OCR/OHR to Cross Lingual Information Retrieval (CLIR), Web 2.0, an Indian language browser, speech interfaces (text to speech, speech to text and speech to speech) and search engines, which are mature in English and fast emerging in other major languages, are to be supported in Indian languages as well. In addition, support for the .IN domain with Indian language domain names is another target area of work.
In Multilingual Computing and Allied Areas, C-DAC continues to work towards the design, development and deployment of technologies/solutions for the following areas:
- Speech Processing
- Speech Recognition
- Speech Synthesis
- Natural Language Processing (NLP)
- Machine Translation
- Information Extraction & Retrieval (IR)
- Semantic Search
- Optical Character Recognition (OCR)
- Indian Languages OCR
- Indian Language On-Line Handwriting Recognition (OHR)
- Localisation
- Fonts (TTF & OTF) for Indian Languages
- Data Processing Tools
- Standardization in Localization benefiting e-governance
- Localisation of Middleware
- IDN & E-mail Id in local languages
- Transliteration amongst Indian Languages
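As an illustration of the last item, the sketch below shows one naive way to transliterate between Indian scripts. It relies on the fact that the major Indic blocks in Unicode follow a parallel, ISCII-derived layout, so corresponding letters sit at the same offset within each 128-code-point block. This is not C-DAC's method: the names `SCRIPT_BASE` and `transliterate` are hypothetical, and real systems must also handle letters that exist in one script but not another, which this sketch simply leaves unchanged.

```python
import unicodedata

# Start of the Unicode block for each script (values from the Unicode Standard).
SCRIPT_BASE = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,
    "gujarati": 0x0A80,
    "oriya": 0x0B00,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80  # each of these blocks spans 128 code points


def transliterate(text: str, source: str, target: str) -> str:
    """Shift characters of the source script to the same offset in the target block.

    Assumption: offset-for-offset correspondence is good enough. Characters
    outside the source block, and characters whose shifted code point is
    unassigned in the target block, are passed through unchanged.
    """
    src = SCRIPT_BASE[source]
    shift = SCRIPT_BASE[target] - src
    out = []
    for ch in text:
        cp = ord(ch)
        if src <= cp < src + BLOCK_SIZE:
            candidate = chr(cp + shift)
            # unicodedata.name returns the default (None) for unassigned code points.
            if unicodedata.name(candidate, None) is not None:
                out.append(candidate)
                continue
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("भारत", "devanagari", "gujarati"))
```

Passing unmapped characters through, rather than dropping them, keeps punctuation such as the danda (U+0964), which is shared across scripts, intact in the output.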
Speech Processing
- Speech Recognition
- Speech corpus creation, analysis and management tools
- Phoneme and grapheme mapping tools
- Text conversion tools
- Speech Synthesis
- Speech corpus creation, analysis and management tools
- Phoneme and grapheme mapping tools
- Text parsing tools
- Speech synthesis tools
- Learning and training modules
- Speech parameter control module
- Intonation and prosodic rule generation
Natural Language Processing
- Machine Translation
- Corpus creation, analysis and management tools
- Pre-processing and post-processing tools
- Parsing and generation tools
- Information Extraction and Retrieval (IR)
- English/Hindi IE/IR systems for the domains of Banking, Agriculture, Railways and Mobile Services
- Cross-lingual IE/IR system using domain-specific translation systems
- Knowledge-based as well as generic search engines
- Summarizer for English and Hindi, etc.
- Semantic Search
- Semantic Search attempts to augment and improve traditional search results (based on Information Retrieval technology) by using data from the Semantic Web, and by adding Indian languages to Semantic Search.
Optical Character Recognition (OCR)
- Indian Languages OCR
- Language independent components, such as image cleaning, skew adjustment, image detection, column detection, table detection, etc.
- Font training module.
- Document analysis module backed by dictionaries, spell checker and auto language detection tools.
- Aligning analyzer, recognition and generator modules.
- Indian Language On-Line Handwriting Recognition (OHR)
- Language independent components, such as image cleaning, skew adjustment, image detection, column detection, table detection, etc.
- Document analysis module backed by dictionaries, spell checker and auto Language detection tools.
- Aligning analyzer, recognition and generator modules.
Localisation