India – a country of 22 languages written in 13 different scripts, 720 dialects and thousands of unofficial languages – has always been a linguistic enigma to the world. In the era of digital transformation, technology is bringing stakeholders together across geographic and cultural barriers. With the advent of deep technologies like Artificial Intelligence, Machine Learning and Natural Language Processing, India’s many languages are poised for a digital revamp, transliteration included.
Currently, the number of Indian-language Internet users is growing at 13% annually, compared with 1% growth among English-speaking Internet users. Nine out of 10 new Internet users from 2016 to 2021 will use local languages.
Internet companies like Google, WhatsApp and Facebook now cater to regional-language speakers too: Facebook supports 13 Indian languages, and several apps on Google’s Play Store support local languages. More than a million bookings are made in local languages every day on the ride-hailing app Ola. Microsoft recently opened up speech data in Gujarati, Tamil and Telugu for researchers to build speech-recognition systems for Indian languages. The company is also working on improving real-time translation of local languages using AI and NLP tools.
By aiding the rapid and accurate translation of Indian languages, technology providers and product companies are extending their reach to the farthest corners of the world. The next billion users live in the remote corners of the country, and a concentrated focus on sharpening technology tools for Indian languages is going to boost Internet penetration and digital transformation across India.
In an effort to educate the startup and industry ecosystems about the transformative power of NLP, the NASSCOM Center of Excellence for Data Science & Artificial Intelligence (DSAI) organized a session in Bangalore on the technical aspects of NLP for Indic languages. The session was led by Vivekananda Pani, CTO and cofounder of Bangalore-based Reverie Language Technologies, one of India’s foremost companies creating Indic language technologies. The company has developed comprehensive language technology solutions for Indic localization and user engagement on digital platforms. Recently, Reliance Industries bought a majority stake in the company for Rs 190 cr and will invest a further Rs 177 cr by March 2021 – a testament to the fast-growing market for digital products and services in vernacular languages.
The session covered:
Fundamentals of Indic NLP
- Brief background to the development and evolution of Indic computing
- Major challenges and strategies in the space
- How Indic NLP differs from English NLP
- Solution needs in the market and challenges specific to Indic
- Strategies and approaches
Introduction to advanced NLP solutions (machine translation and automatic speech recognition, or ASR) with use cases
- Use of machine learning for NLP
- What it takes to develop a machine translation system, beyond the machine learning
- What it takes to develop an ASR system, beyond the machine learning
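As a small illustration of how Indic text differs from English at the character level, the sketch below guesses the dominant script of a string and counts its Unicode code points. It is not taken from the session: the function name `dominant_script` is hypothetical, and the approach (reading the first word of each character's Unicode name) is a simplifying assumption that works for the major Indic scripts but is not a production-grade language detector.

```python
import unicodedata
from collections import Counter


def dominant_script(text: str) -> str:
    """Guess the script of `text` from Unicode character names.

    Assumption: the first word of a character's Unicode name
    (e.g. "DEVANAGARI LETTER KA") identifies its script.
    """
    counts = Counter()
    for ch in text:
        # Count letters (L*) and combining marks (M*);
        # skip digits, spaces and punctuation.
        if unicodedata.category(ch)[0] in ("L", "M"):
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


hindi = "नमस्ते"
tamil = "வணக்கம்"

print(dominant_script(hindi))    # DEVANAGARI
print(dominant_script(tamil))    # TAMIL
print(dominant_script("hello"))  # LATIN

# Vowel signs and the virama are separate code points, so the
# code-point count is larger than the number of written units a
# reader perceives.
print(len(hindi))  # 6
```

The last line is the point of the example: Python's `len` counts code points, so tasks such as truncation, cursor movement or character-level tokenization need script-aware handling for Indic text that English text rarely requires.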
Watch the full video here