AI learns Hindi: How a health website achieved 86% accuracy in medical diagnosis
By training its AI models to understand vernacular language, myUpchar was able to increase the accuracy of online health diagnoses.
By Soumik Ghosh
The mobile healthcare market in India is booming. It’s set to expand at a compound annual growth rate of 31%, reaching INR 13,800 crore by the end of 2024, according to the report “Healthcare Apps Market in India 2019” from Research and Markets. The COVID-19 outbreak has only catalysed the sector’s rapid growth.
Urban India is no stranger to online health consultation thanks to the likes of Practo and WebMD. Now, though, online healthcare companies are seeing increased traction in tier-2 and tier-3 cities too.
In fact, it’s smaller cities, and not metros, that are now the biggest growth drivers in the health insurance domain.
In 2016, two Stanford graduates, Rajat and Manuj Garg, identified the business opportunity these cities represented and created myUpchar to meet the need for reliable healthcare consultation.
More recently, with demand for online health consultation rising, myUpchar turned to AI to improve the efficiency of the registered doctors in its network. The system examines patients’ inputs, combines them with the company’s existing data, and suggests a diagnosis to the doctors.
Rajat Garg, CEO and co-founder of myUpchar, said that the company has done over 10 lakh (1 million) free consultations, which turned out to be a source of quality data.
“This data, along with our research, is used to provide the right diagnosis and medications for the queries. Our system then analyses user inputs which are combined with our data to generate a diagnosis and prescription,” he explained.
He added that the technology helped achieve a high accuracy level which decreased the time per consultation to 1–2 minutes.
The linguistic challenge
Garg said that the foremost concern was the scarcity of research in the Hindi AI space. Choosing the right kind of AI technology was a crucial task and regular evaluation of metrics for different models was essential to achieve optimal results.
“Hindi translation is quite a difficult task due to the varied ways in which a Hindi word is spelled by the Indian population. For example, the English word ‘disease’ can be typed as ‘bimari’ or ‘beemari’. To solve this issue, we created our own dictionary to map keywords to the diseases,” he explained.
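The dictionary approach Garg describes can be illustrated with a minimal Python sketch. It is not myUpchar's code: the table name `VARIANT_TO_KEYWORD`, the function `extract_keywords`, and every entry other than "bimari" and "beemari" are hypothetical, chosen only to show how several romanized spellings can resolve to one canonical keyword.

```python
import re
from typing import List

# Hypothetical lookup table: romanized Hindi spelling -> canonical keyword.
# Only "bimari"/"beemari" come from the article; the rest are illustrative.
VARIANT_TO_KEYWORD = {
    "bimari": "disease",
    "beemari": "disease",
    "bukhar": "fever",
    "bukhaar": "fever",
    "khansi": "cough",
    "khaansi": "cough",
}


def extract_keywords(query: str) -> List[str]:
    """Return canonical keywords found in a query, in order, without repeats."""
    tokens = re.findall(r"[a-z]+", query.lower())
    keywords: List[str] = []
    for token in tokens:
        keyword = VARIANT_TO_KEYWORD.get(token)
        if keyword is not None and keyword not in keywords:
            keywords.append(keyword)
    return keywords
```

With this table, a query such as "Mujhe bukhaar aur khansi hai" yields the keywords fever and cough regardless of which spelling the user typed. A production system would need a far larger table and probably fuzzy matching for spellings it has not seen.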
The development team used a mix of several models for word stemming, stop-word removal, translation, and identifying root words in different languages. It then used the BERT model for Natural Language Processing (NLP).
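As a rough illustration of what stop-word removal and stemming do to a query before it reaches a model such as BERT, here is a small Python sketch. The stop-word list, the suffix rules, and the function names are assumptions made for this example. The article does not say which tools myUpchar used for these steps, and real stemmers are considerably more sophisticated.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stop-word list and suffix rules.
STOP_WORDS = {"the", "a", "is", "i", "have", "and", "for", "my"}
SUFFIXES = ("ing", "s")


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> List[str]:
    """Lowercase, tokenize, drop stop words, then stem what remains."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(token) for token in tokens if token not in STOP_WORDS]
```

Under these rules, "I have headaches and coughing" reduces to the two tokens headache and cough, which is the kind of normalized input a keyword dictionary or classifier can work with.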
BERT, the Bidirectional Encoder Representations from Transformers model, was first developed by Google researchers in 2018. The open-source code has since achieved state-of-the-art accuracy on many NLP and Natural Language Understanding (NLU) tasks.
A major success factor for BERT in the industry has been that its language processing capabilities can be used to empower other AI models.
Following the BERT implementation, the team at myUpchar used the TensorFlow library and Apple’s Turi Create library for the recommendation engine. Turi Create is designed to simplify the development of ML models, so that a user without machine learning expertise can add features such as recommendations to an application.
Given that there was quite a lot of data to deal with, the team used MongoDB along with AWS EMR and Spark for querying the data.
Garg said that their model typically takes all the inputs from users into account, including their query, demographic information and medical history. Based on the query, the user is asked additional questions by a bot, which then allows the model to determine a diagnosis.
The human element
This diagnosis is then suggested to the doctor. If the doctor does not accept the inference and provides an alternative diagnosis, the system receives this feedback and uses it to retrain the model. myUpchar’s medical team also reviews the case and provides feedback on what caused the misdiagnosis.
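The review loop described here can be sketched in a few lines of Python. This is an assumption-laden illustration, not myUpchar's implementation: the `Review` record, `acceptance_rate`, and `retraining_examples` are hypothetical names, and the article does not specify how the company stores feedback or computes accuracy. The sketch treats the share of suggestions a doctor accepts as a simple accuracy proxy and collects the doctor's corrections as new training pairs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Review:
    """One doctor review of a model suggestion (hypothetical record)."""
    query: str
    suggested: str
    # None means the doctor accepted the suggestion unchanged.
    doctor_diagnosis: Optional[str] = None

    @property
    def accepted(self) -> bool:
        return self.doctor_diagnosis is None or self.doctor_diagnosis == self.suggested

    @property
    def final_diagnosis(self) -> str:
        return self.doctor_diagnosis or self.suggested


def acceptance_rate(reviews: List[Review]) -> float:
    """Fraction of suggestions the doctor accepted; 0.0 for an empty list."""
    if not reviews:
        return 0.0
    return sum(r.accepted for r in reviews) / len(reviews)


def retraining_examples(reviews: List[Review]) -> List[Tuple[str, str]]:
    """(query, corrected diagnosis) pairs for every rejected suggestion."""
    return [(r.query, r.final_diagnosis) for r in reviews if not r.accepted]
```

In a setup like this, the corrected pairs would be added to the training set for the next retraining run, while the acceptance rate gives a running measure of how often the model and the doctor agree.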
“Currently, our algorithm has been tested for certain disease groups and provides an accuracy of 86%. As the accuracy improves further, we will deploy it for additional disease groups thus reducing doctor’s time in the interaction,” said Garg. This, he explained, is because the algorithms will ask very specific questions to come to a suggested diagnosis which doctors can then review.
“Furthermore, in one of our AI models, we scan lab reports to automatically come up with a diagnosis which is then fed into our consultation platform for the doctor to view,” he added. myUpchar has also developed a product that scans chest X-rays for 14 diseases and shows the results to the doctor during the interaction with the patient.
In closing, Garg observed that the COVID-19 pandemic has brought an opportunity for the healthcare industry to increase the adoption of digital processes and automation.