Artificial Intelligence Technique for Predicting Type 2 Diabetes


Department of Computer Science and Engineering,

N.M.A.M. Institute of Technology, Nitte, Karnataka 574110, India



Diabetes is among the most common diseases today. Type 1 diabetes, Type 2 diabetes, and gestational diabetes are its most common types. The aim of this chapter is to predict Type 2 diabetes from various parameters. A “diabetes risk score or test system” is designed with an artificial intelligence technique, using risk factors such as age, waist circumference, physical activity, family history, and body mass index. This chapter also aims to design a universally acceptable diabetes prediction system that estimates the possibility of diabetes risk. The prediction relies on parameters of the patient’s lifestyle and does not use data from medical test results. Individuals who want to know their risk score can use this diabetes risk score system.


In the present scenario, diabetes is one of the most common diseases. Type 1 diabetes occurs when the pancreas does not produce insulin. This chapter focuses on Type 2 diabetes (T2D). In T2D, cells cannot use glucose efficiently for energy. This happens when the cells become insensitive to insulin and the blood glucose level gradually becomes too high. T2D has various causes, including being overweight, lack of physical activity, stress, genetics, and consuming large amounts of foods or beverages with sugar and simple carbohydrates. Risk factors include a family history of diabetes, being sedentary, being overweight, etc. The major symptoms of T2D are excessive thirst, dark skin under the armpits, chin, or groin, blurry vision, etc. [1].

The number of people with T2D is increasing, and the disease is a major cause of death. Several researchers have studied T2D and shown that the disease can be prevented through lifestyle modification [2].

The aim of this work is to develop a diabetes risk score system with the most widely used artificial intelligence (AI) methodologies. AI is now applied in a variety of research areas because of its many applications. In this chapter, we introduce a system that predicts T2D based on different parameters and, using an expert system, informs patients about the parameter that most affects their T2D risk [3]. Anyone who wants to know their T2D risk score can use this system.


Several researchers have developed diabetes risk score systems.

Mohan et al. [1] developed the Indian Diabetes Risk Score with the help of the Madras Diabetes Research Foundation (MDRF-IDRS) to help identify undiagnosed T2D mellitus in that population. To develop the MDRF-IDRS, they took 26001 samples from 155 wards in the Chennai Urban Rural Epidemiology Study. They built the system with four simple parameters: age, waist circumference (WC), family history, and physical activity. They derived the score using logistic regression and set the maximum score to 100. An IDRS score <30 is considered low risk, 30-50 medium risk, and >60 high risk. To determine the optimum cutoff value (>60), they used receiver operating characteristic (ROC) analysis.

Lindstrom and Tuomilehto [2] developed a practical tool to predict T2D risk in Finland. Here, 4500 samples were collected to develop the system, considering various categorical variables such as age, gender, food, family history, waist, physical activity, blood pressure (BP), high blood glucose, and body mass index (BMI). Logistic regression was used to compute β coefficients for known risk factors for diabetes, the β coefficients of the model were used to assign a score value to each factor, and the composite Diabetes Risk Score was calculated as the sum of those scores. In the final tool, a risk score <7 is considered low risk, 7-14 moderate risk, 15-20 high risk, and >20 very high risk.
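The score construction described above (coefficients mapped to points, points summed, and the total mapped to a risk band) can be sketched in Python. This is a minimal illustration, not the authors' implementation: the coefficient values, the factor names, and the rounding rule in points_from_beta are assumptions made for the example, and only the four risk bands come from the text.

```python
# Hypothetical beta coefficients from a fitted logistic regression model.
# These values are illustrative only and are NOT taken from the cited study.
HYPOTHETICAL_BETAS = {
    "age_55_to_64": 0.9,
    "bmi_over_30": 1.1,
    "large_waist": 1.4,
    "low_physical_activity": 0.3,
    "family_history": 0.7,
}


def points_from_beta(beta):
    """Turn a coefficient into integer points (assumed rule: double, round, minimum 1)."""
    return max(1, round(beta * 2))


def risk_score(present_factors, betas=HYPOTHETICAL_BETAS):
    """Composite score: the sum of the points of every risk factor present."""
    return sum(points_from_beta(betas[name]) for name in present_factors)


def risk_category(score):
    """Risk bands as stated in the text: <7, 7-14, 15-20, >20."""
    if score < 7:
        return "low"
    if score <= 14:
        return "moderate"
    if score <= 20:
        return "high"
    return "very high"


# A person with every hypothetical factor present.
total = risk_score(HYPOTHETICAL_BETAS.keys())
print(total, risk_category(total))
```

Keeping the coefficient-to-points rule in one small function makes it easy to swap in whatever mapping a particular study actually used.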

Katulanda et al. [3] developed the Diabetes Risk Score in Sri Lanka (SLDRISK) from 4276 samples. The SLDRISK is based on variables such as age, family history, gender, WC, physical activity, and BP. Univariate regression analysis was used to identify the variables, and the β coefficient values used to derive the risk score were obtained with logistic regression. ROC analysis was used to find the optimal cutoff value, sensitivity, and specificity. The authors also validated the SLDRISK against the IDRS and the Cambridge Risk Score (CRS). They concluded that the SLDRISK has a sensitivity of 77.9% and a specificity of 65.6%, which are higher than those of the IDRS and CRS.
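To make the cutoff analysis concrete, the following Python sketch computes sensitivity and specificity at a given cutoff and then scans the candidate cutoffs. The text does not say which criterion the authors used to pick the optimal cutoff; maximising Youden's J (sensitivity + specificity - 1) is assumed here as one common choice, and the toy data are invented for illustration.

```python
def sensitivity_specificity(scores, has_diabetes, cutoff):
    """Treat score >= cutoff as screen-positive; return (sensitivity, specificity)."""
    pairs = list(zip(scores, has_diabetes))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)      # diabetic, flagged
    fn = sum(1 for s, d in pairs if d and s < cutoff)       # diabetic, missed
    tn = sum(1 for s, d in pairs if not d and s < cutoff)   # healthy, cleared
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)  # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, has_diabetes):
    """Pick the cutoff that maximises Youden's J (an assumed criterion)."""
    def youden(cutoff):
        sens, spec = sensitivity_specificity(scores, has_diabetes, cutoff)
        return sens + spec - 1
    return max(sorted(set(scores)), key=youden)


# Toy data, invented for illustration only.
toy_scores = [10, 20, 30, 40, 45, 50, 60, 70, 80]
toy_labels = [False, False, False, True, True, False, True, True, True]

cutoff = best_cutoff(toy_scores, toy_labels)
print(cutoff, sensitivity_specificity(toy_scores, toy_labels, cutoff))
```

Raising the cutoff trades sensitivity for specificity, which is why a single pair of figures such as those reported for a risk score is always tied to one chosen cutoff.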

Al-Lawati and Tuomilehto [4] proposed the Diabetes Risk Score in Oman. They developed the diabetes risk score system for identifying diabetes mellitus with 4881 samples. The logistic regression method was used with parameters such as age, gender, family history, WC, BP, BMI, and smoking. They concluded that the probability of getting T2D rises as age, WC, and BMI increase. Age and family history are the strongest predictors, whereas BMI, BP, and WC are moderate predictors. The Oman risk score system was validated with the Nizwa survey, which contains the same parameters, with 1432 samples, of which 145 had diabetes.

Griffin et al. [5] developed a diabetes risk score system for Cambridge. They collected data from 1077 people aged 40-64 years and entered the information into a regression model. Here, specificity is 72%, sensitivity is 77%, and the area under the ROC curve is up to 80%.

Zhou et al. [6] developed a risk score system for T2D mellitus for the Chinese population, using 5453 samples. The system was developed based on lifestyle and other factors such as gender, age, physical activity, family history, WC, history of dyslipidemia, diastolic BP, and BMI. They took the cutoff value as 17, with 67.9% sensitivity and 67.8% specificity. They compared the developed system with the American Diabetes Association Score (0.636), Inter99 Score (0.669), and Oman Score (0.675). They concluded that the area under the ROC curve (AUC) of the Chinese Diabetes Risk Score is 0.723, which is higher than that of the other systems.
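As a brief illustration of the AUC used in this comparison, the Python sketch below computes it directly from its ranking interpretation: the fraction of (diabetic, non-diabetic) pairs in which the diabetic person receives the higher score, with ties counted as half. The function name and the toy data are assumptions for the example; the four AUC values in the dictionary are the ones reported in the text.

```python
def auc(scores, has_diabetes):
    """AUC as the share of (diabetic, non-diabetic) pairs ranked correctly."""
    pos = [s for s, d in zip(scores, has_diabetes) if d]
    neg = [s for s, d in zip(scores, has_diabetes) if not d]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # diabetic person scored higher: correct ranking
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
toy_auc = auc([1, 2, 3, 4], [False, True, False, True])
print(toy_auc)

# AUC values as reported in the text for the compared scoring systems.
reported_auc = {
    "Chinese Diabetes Risk Score": 0.723,
    "American Diabetes Association Score": 0.636,
    "Inter99 Score": 0.669,
    "Oman Score": 0.675,
}
print(max(reported_auc, key=reported_auc.get))
```

An AUC of 0.5 corresponds to a score that ranks people no better than chance, and 1.0 to perfect separation, so the reported values all indicate moderate discrimination.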

Bang et al. [7] developed and validated a patient self-assessment diabetes screening score for US adults. In the National Health and Nutrition Examination Survey (in the Atherosclerosis Risk in Communities/Cardiovascular Health Study), selecting 30% (40%) of people for diabetes screening yielded a sensitivity of 79% (72%), specificity of 67% (62%), positive predictive value of 10% (10%), and positive likelihood ratio of 2.39 (1.89). In contrast, the comparison scores yielded a sensitivity of 44%-100%, specificity of 10%-73%, positive predictive value of 5%-8%, and positive likelihood ratio of 1.11-1.98. This new diabetes screening score, simple and easily implemented, appears to improve upon the existing methods. Future studies are needed to evaluate it in diverse populations in real-world settings.
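The positive likelihood ratios reported for this score follow from the reported sensitivity and specificity through the standard definition LR+ = sensitivity / (1 - specificity). The short Python check below reproduces both figures; the function and variable names are chosen for illustration.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


# Sensitivity and specificity as reported in the text.
nhanes_lr = positive_likelihood_ratio(0.79, 0.67)
aric_chs_lr = positive_likelihood_ratio(0.72, 0.62)

print(round(nhanes_lr, 2))    # 2.39
print(round(aric_chs_lr, 2))  # 1.89
```

A positive likelihood ratio above 1 means that a positive screening result is more likely in people with diabetes than in people without it; the larger the ratio, the more informative the positive result.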

Rigla et al. [8] described AI techniques that help detect diabetes. AI is widely used in a variety of applications. Because of abilities such as learning and reasoning, it can be applied to predicting the diabetes risk score. The AI techniques explained include data mining, fuzzification, defuzzification, support vector machine (SVM), heuristic approaches, hybrid systems, naive Bayes, supervised learning, and unsupervised learning.

Chen et al. [9] developed a risk assessment tool for Australians called AUSDRISK. According to the survey, they reported that by 2025 the number of people with diabetes will reach 2 million. To prevent this, lifestyles should be improved. They took a sample of 6060 people with the corresponding five-year data. Based on this, they predicted and computed the risk score.

Unwin et al. [10] gave important messages about diabetes: diabetes is a large and growing problem, and the costs to society are high and escalating; diabetes is a neglected development issue, affecting all countries; there are cost-effective solutions for reversing the global diabetes epidemic; and diabetes is not just a health issue, its causes are multisectoral, and it requires a multisectoral response.

Observations: The diabetes risk score systems of different countries were studied, and the parameters considered in these systems are shown in Table 6.1.


The motivation behind this work is to estimate the T2D risk of individuals who want to know their risk score. Therefore, the diabetes risk score system is designed without any laboratory tests. The objectives are as follows.

  • Study different diabetes risk score systems to understand the parameters used in risk estimation.
  • Build a dataset using the parameters that are used for the prediction.














  • Apply a suitable machine learning algorithm and compute the score on this designed dataset (with some of the parameters).
  • The built dataset should represent all existing scoring systems and should be able to represent people from around the world.
  • Add more features to the designed system, recalculate the score, and compare it with the original system. (How can the original system be fine-tuned if other features are added?)
  • Validate the data using the existing diabetes risk score systems.