For the average person, "conversational AI" is not a term encountered frequently in daily life. Yet we deal with one form of conversational AI or another almost every day. From chatbots sorting out issues in apps like Zomato and Uber, to smart assistants like Alexa and Google Assistant, conversational AI is the technology driving modern solutions to our modern problems. Don't feel like getting up to switch on the lights? Tell Alexa to do it for you! Food not getting delivered? Ask the Zomato chatbot! And while talking to a machine may look rather silly, we are amused and entertained by the smart speakers in our living rooms, asking them to do anything from telling our kids bedtime stories to buying things online. Over one-third of the 3.5 billion searches on Google are conducted using voice, and Google expects that in the future more than half of all searches will be done via voice.

There's a stark shift in the way we interact with our machines. Instead of speaking the languages of computers, we are teaching computers to speak ours. We're relying less on traditional inputs like typing and touching, and moving towards a paradigm where our voice is the input required. Sales of smart speakers are on the rise, with IDC predicting the segment will grow by 32 percent between 2017 and 2022. Both Amazon and Google hold a major slice of the voice assistant pie in India, and a recent report by NASSCOM on conversational AI highlights this growing trend. Chatbots are already mainstream, with most online businesses now employing some form of chatbot or the other, be it an automated Facebook Messenger chat or WhatsApp for Business. Now the industry is experiencing a shift from text-based bots to voice-enabled solutions. But has it been a smooth ride so far? Well, not really.

While conversational AI has made interacting with online stores and apps easier, it can also be quite frustrating. Most services work like FAQs, answering only what they already know. There's also the latency that devices like the Echo speakers face in fetching answers. And while it does help to have a human touch in online customer service, the truth is that these bots aren't always very helpful.

Vishal Chahal, a leading executive at IBM, figures the problem lies in chatbots not understanding the intent behind a query. While they are trained to answer a certain number of questions, today's bots still cannot grasp the intent in the subject matter. The datasets used to train them are fixed in nature, so the bots cannot grow and evolve based on past activity. Nor can they draw conclusions on their own, things even an unskilled customer executive would be able to figure out.

However, there is technology available to help improve the success rate, and Chahal says there are multiple approaches to solving it. “You ask a question. It is backed by a search technology at the back. It searches for candidate answers, and then ranks them using a machine. That's one approach. The second approach is it first looks for a question-answer pair. If it doesn't find a pair, then it runs a machine learning model that finds the nearest matching fact. That is deep learning. Then there's what we call a knowledge graph. Convolutional Neural Network (CNN) models that people are now using to teach dynamic inference based on what you're saying. But all these are different approaches to make the content dynamic. What's still not come is the intent dynamism.”
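The first approach Chahal describes, searching for candidate answers and then ranking them, can be illustrated with a minimal sketch. This is a toy illustration, not IBM's actual technology: the FAQ entries, threshold and fallback message are all hypothetical, and the "ranking" is a simple bag-of-words cosine similarity rather than a trained model.

```python
from collections import Counter
import math

# Hypothetical FAQ knowledge base of question-answer pairs.
FAQ = {
    "where is my order": "Your order is out for delivery.",
    "how do i get a refund": "Refunds are processed within 5-7 business days.",
    "how do i cancel my order": "You can cancel from the Orders page within 60 seconds.",
}

def bag_of_words(text):
    # Represent an utterance as word counts.
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def answer(query, threshold=0.3):
    # Rank every candidate question against the query; pick the best match.
    query_vec = bag_of_words(query)
    best_q, best_score = max(
        ((q, cosine_similarity(query_vec, bag_of_words(q))) for q in FAQ),
        key=lambda pair: pair[1],
    )
    if best_score < threshold:
        # No close match: this is where a real system would fall back to a
        # learned model or a human agent.
        return "Sorry, I don't understand. Transferring you to an agent."
    return FAQ[best_q]

print(answer("where is my order right now"))   # matches the first FAQ entry
print(answer("what is the weather"))           # falls below the threshold
```

The fixed FAQ dictionary is also exactly the limitation described above: the bot can only rank against questions it was given, which is why a low-score fallback path matters.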

Fueling the charge of voice-based AI services is an increased focus on natural language generation and understanding, the underlying technology that powers conversational AI. It includes Natural Language Processing (NLP), which allows a computer to understand human language as it is spoken or written: first filtering out the keywords to look for, analysing the sentiment of the user and translating human language to machine language, then using Natural Language Generation and machine learning to answer the query. Processing language has now become fairly mainstream, with most AI services able to understand commands with ease. What is still difficult is the answering part.
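The pipeline just described can be sketched end to end. This is a deliberately crude illustration under stated assumptions: the stopword and sentiment word lists are tiny and hypothetical, and the "generation" step is a template rather than a real NLG model, but the stages mirror the text: filter keywords, score sentiment, produce a reply.

```python
# Toy word lists; a real NLP pipeline would use trained models instead.
STOPWORDS = {"the", "a", "an", "is", "my", "was", "very", "i", "and", "to"}
POSITIVE = {"great", "good", "love", "fast"}
NEGATIVE = {"late", "cold", "bad", "terrible"}

def extract_keywords(utterance):
    # Keyword filtering: drop stopwords, keep the content words.
    return [w for w in utterance.lower().split() if w not in STOPWORDS]

def sentiment(keywords):
    # Sentiment analysis by counting positive vs negative words.
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in keywords)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def respond(utterance):
    # Template-based stand-in for Natural Language Generation:
    # pick a reply skeleton based on the detected sentiment.
    keywords = extract_keywords(utterance)
    if sentiment(keywords) == "negative":
        return f"Sorry to hear that. Escalating your issue about: {', '.join(keywords)}."
    return f"Glad to help! Noted: {', '.join(keywords)}."

print(respond("my food was late and cold"))
```

Even this toy version shows why "the answering part" is the hard bit: understanding (keywords, sentiment) is mechanical, but producing a genuinely useful reply needs far more than a template.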

How are people using these voice-based services, though? The report claims 82 percent of users use voice bots to seek information: a Google search, a weather check and the like. Around 67 percent use them to play or stream music and videos (count me in that group!), and around 36 percent use them to access customer service. A similar number uses voice assistants to buy products and order meals, and around 20 percent of people use AI to book a cab, control smart devices and make bill payments.

Here, both consumer and enterprise solutions are coming up en masse. Over one-fourth of startups dabbling in AI are focusing on conversational AI, with focus areas including business analytics, chatbot and NLP engines, AI platforms, image processing and more. And while the surge in activity is certainly encouraging, there's still work to be done.

“I think the key issue which is being faced today in conversational AI is the aspect of nuanced understanding. A human to human interaction can auto correct itself because the human face is also a representation of your emotions, behaviours and kind of a sum total of different expressions. The machine, I don’t think so, understands that today, which is where the challenges come,” Sameer Dhanrajani, Chief Strategy Officer at Fractal Analytics said, highlighting where the challenges lie in adopting AI services.

Surprisingly, a majority of voice searches are coming from India. The NASSCOM report states that 51 percent of internet users using voice-enabled functions are from India.

“I think it (voice-enabled usage) presently caters to a demographic where people are not very comfortable using technology like search and so on. So you see a lot of adoption with, let's say, senior citizens with local language support coming in. Most of the new internet users from India are not proficient in English. They prefer doing Google searches in Hindi, Tamil, Kannada and the other local languages. I think this is where the voice-based search capability can grow, because it makes interaction very simple. I think more and more companies, and platforms like Google, should start leveraging this, and they can get a lot more people to adopt their solutions. They can also cater to a larger demographic. It could be tier two, tier three or even senior citizens, and not just the urban educated population which is already aware of the technology. Voice-based technology needs to be mainstream and simple and easy to use,” Amar BP, from a Samsung-funded startup, added about the uptick in regional language voice searches.

It's true to an extent. Most new users from smaller cities, who speak only regional languages and are not very good with technology, are relying on voice to navigate the internet. It's common to see cab drivers use their voice to enter a destination on Google Maps, because typing is still difficult for them. In fact, tech-savvy people are not really the ones using voice for crucial actions; it's the underprivileged and the less educated who are using it more. Kids, too, are more comfortable talking to a smart speaker to play their favourite music or cartoons.

Advancement of AI will invariably require access to sensitive data. If we want our machines to behave like humans, we have to nurture them with the data that defines us. Access to data will define how smart Alexa or Google Assistant will be, or how well a chatbot can assist you. Ideally, companies collecting data on you should keep you informed of what you're giving them. But then again, where's the guarantee that the data collected by governments and private companies isn't misused?

In the Indian context, data privacy is of prime importance. More than that, people need to be educated about what data they're giving away. People are raising big concerns about data privacy on social media, and a data privacy bill introduced in Parliament this year will demarcate which data can be used to train algorithms, and how. We have seen instances of startups that relied heavily on user data to build their products but also misused it for other purposes; Facebook is a prime example. According to Mr. Chahal, more than 80 percent of the Indian public will still not understand the nuances of data privacy even ten years down the line.

“For the Indian context, let’s be careful about data privacy issues. The public is still not educated on this. And we as technologists, we have a responsibility on this. It’s good not to have that model (of uninformed data collection). It’s great to have the confidence of giving the data and it will look good now because you are given services based on the model which was developed without your consent. But when it goes to a different scenario, a different use case, you might not like it,” Mr. Chahal added.

GDPR is a beacon of light for those who believe in protecting their personal data. The European Union passed the GDPR mandate requiring all companies to inform users about what data they are collecting about them.

“Eventually, every country will have its own data protection policy. As practitioners of technologies in this domain, just like doctors have the Hippocratic oath, we will have to swear to protect user data,” Mr. Dhanrajani said.

But how well do you stand by that oath?

Mr. Dhanrajani's argument is that there's a culprit in every society; you have to be responsible, sensible and judicious. But how do you do that in this case? Awareness of how and where your data is being used is important, and the whole complexity around AI needs to be simplified. The population of the last ten years is known as the knowledge population, whose data is being used to train all sorts of algorithms. If we can instil in that population an understanding of what's happening to their data, there can be accountability in the system.

Mr. Chahal agreed. “Certainly, for a country like ours, for wider adoptability, it has to be simplified. And as technologists, we do owe it to society to demystify and be transparent about how we use and train AI, and to be open to being audited. That, I think, will actually give confidence to everybody around today, from technologists to adopters to end users, to share and use data in an ethical manner.”

Open-sourcing technology is another way to instil confidence in artificial intelligence. The more open source gets adopted, the more awareness gets built. Open-source tools like Google's TensorFlow and similar offerings from Microsoft and Apple can help make AI trustworthy.

“Eventually, business models are changing. IBM's own AI platform today supports every open-source package. And in a place like India, where the developer community is big, we cannot be forcing people to go for a certain set of technology tools. We say you choose the open-source technology you want to use, and when you want to scale up the model, you come to us. We will take care of the deployability. But as far as development is concerned, you can choose what you want,” Mr. Chahal said, talking about IBM's policy of supporting open-source tools.

To sum it all up, it's clear that conversational AI still has a long way to go before it becomes mainstream. While adoption is encouraging, it still needs to be taken just as seriously as other means of communicating with our machines. But for our machines to behave like humans, they also need to keep feeding on our data, which can be problematic if people are not aware of how that data is being used. It's actually helpful that conversational AI is still in its infancy: there is still time to address the challenges and obstacles, and to build machines that are unbiased, really intelligent and actually helpful.


As published in Digit under their Machine Learning and AI section, by Subhrojit Mallick
