RESULTS AND DISCUSSION
6.6.1 FOR ASIAN COUNTRIES
Three combinations of parameters are used for prediction with different algorithms for the different Asian countries, and accuracy, precision, sensitivity, and specificity are calculated for each. Sensitivity and specificity rates are needed to draw the ROC curve. The three combinations are: Combination 1: age, gender, physical activity, family history, and WC; Combination 2: age, gender, physical activity, family history, WC, and BMI; and Combination 3: age, gender, physical activity, family history, and BMI.
FIGURE 6.3 Workflow of the consensus-based final prediction.
While developing the model with logistic regression, the X and Y values are defined first. X is the matrix that contains the attributes from the dataset, and Y is the vector of outcomes on which the prediction is based. X is defined as the corresponding p values of age, waist, physical activity, family history, and BMI. Once X and Y are defined, they are split into a training set and a testing set. Here, the split is done with sklearn, with the random state value set to zero. A logistic regression model is then fit on the training set (X_train and Y_train). Once the model is fit, predictions are made on the testing set (X_test) and the accuracy is calculated. For this dataset, the accuracy for combinations 1-3 for the Asian countries is up to 0.8505, 0.9779, and 0.960, respectively.
Once the model is built, the confusion matrix is created, which gives the number of correct and incorrect predictions in the form of an array. The confusion matrix for our dataset for combination 2 of Asian countries is shown in Figure 6.4; its dimension is 2 x 2. For example, in the Indian system, the correct predictions number 3153 and 122,587 (diagonal values) and the incorrect predictions number 1846 and 1010. Similarly, the confusion matrix is calculated for the other countries and used for the prediction.
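The way the reported rates follow from a 2 x 2 confusion matrix can be illustrated with a minimal sketch. It uses the four counts quoted above for the Indian system (combination 2). The assignment of the two off-diagonal counts to false positives and false negatives is an assumption: it follows the scikit-learn layout [[TN, FP], [FN, TP]] and was chosen because it matches the tabulated specificity. The function name `confusion_metrics` is illustrative only.

```python
# Counts quoted in the text for the Indian system, combination 2.
# Assumption: the matrix follows the scikit-learn layout [[TN, FP], [FN, TP]].
TN, FP = 3153, 1846
FN, TP = 1010, 122587


def confusion_metrics(tn, fp, fn, tp):
    """Return accuracy, precision, sensitivity, and specificity as fractions."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
    }


metrics = confusion_metrics(TN, FP, FN, TP)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under this assumed layout, the accuracy, sensitivity, and specificity agree with the India row of Table 6.4 (97.78%, 99.18%, and 63.07%), and the precision comes out within rounding of the tabulated 98.51%.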
From the logistic regression model, the classification rate, precision, recall, sensitivity, and specificity for combination 2 are shown in Table 6.4. Values for the remaining combinations are obtained in the same way. Specificity is the true negative rate, that is, the percentage of patients who are correctly identified as being healthy; using the logistic regression model for combination 2, it is 63.07%, 70.94%, 69.38%, and 80.18% for India, China, Sri Lanka, and Oman, respectively. Sensitivity is the true positive rate, that is, the percentage of patients who are correctly identified as having the disease; using the logistic regression model for combination 2, it is 99.18%, 98.64%, 98.74%, and 98.54%, respectively, for the same countries. These values are likewise computed using four different algorithms for the Asian countries, as shown in Tables 6.5-6.7.
FIGURE 6.4 Visualizing the confusion matrix.
TABLE 6.4 Accuracy, Sensitivity, and Specificity Percentage Using Machine Learning Algorithms
Logistic Regression   Accuracy (%)   Precision (%)   Recall (%)   Sensitivity (%)   Specificity (%)
India                 97.78          98.51           99.1         99.18             63.07
China                 96.98          98.14           98.6         98.64             70.94
Sri Lanka             97.08          98.18           98.7         98.74             69.38
Oman                  97.175         98.41           98.5         98.54             80.18
The optimum value (>95%) is taken as the high-risk score for diabetes; it is identified from the ROC curves for India, China, Sri Lanka, and Oman, as shown in Figure 6.5.
TABLE 6.5 Combination 1 (Without Considering BMI)
Algorithm               Accuracy (%)   Precision (%)   Sensitivity (%)   Specificity (%)
India
Logistic Regression     85.05          87.40           91.39             71.17
Gaussian Bayes Model    78.4           78              79.0              77.0
Random Forest           71.1           69.5            71.9              69.3
Decision Tree           72.7           67              67.9              67.8
China
Logistic Regression     93.98          92.97           72.17             70.81
Gaussian Bayes Model    95.30          96              96.0              95.0
Random Forest           96             95              95.0              92.0
Decision Tree           89.2           85.5            70.2              65.8
Sri Lanka
Logistic Regression     94.37          94.28           74.19             61.53
Gaussian Bayes Model    94.50          89              94                92
Random Forest           94.0           88.2            94.8              91.7
Decision Tree           92.0           77.7            70.3              65.6
Oman
Logistic Regression     92.55          90.19           73.18             69.72
Gaussian Bayes Model    92.62          86.7            93.0              89.9
Random Forest           93.0           86              93.0              89.0
Decision Tree           92.0           79.0            79.0              66.1
The systems are cross-validated using the AUC: the values are 0.9873 for the Indian system, 0.9870 for the China system, 0.9879 for the Sri Lanka system, and 0.9883 for the Oman system.
There may be complex and unknown relationships between the variables in the dataset. It is important to discover and quantify the degree to which the variables depend on one another. This knowledge helps in preparing the data to meet the expectations of machine learning algorithms. Variables within a dataset can be related for a number of reasons. A correlation can be positive or negative, depending on whether the two variables move in the same or in opposite directions; it can also be neutral or zero, implying that the variables are unrelated. The correlation between different features for the Indian system is shown in Figure 6.6. The pair plot according to the correlation features is shown in Figure 6.7. A similar approach is applied to all the remaining T2D systems of Asian countries and then analyzed.
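As an illustration of how pairwise dependence between features can be quantified, the following is a minimal sketch of a Pearson correlation matrix in plain Python. The feature values are hypothetical toy records, not data from the study, and the helper name `pearson` is illustrative; the chapter does not state which correlation measure or library was used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy records, NOT data from the study.
features = {
    "age": [25, 38, 47, 52, 61, 33],
    "waist": [78, 88, 95, 99, 104, 84],
    "bmi": [21.5, 24.8, 27.9, 29.4, 31.2, 23.6],
}

# Correlation of every feature with every other feature.
names = list(features)
corr = {a: {b: pearson(features[a], features[b]) for b in names} for a in names}
for a in names:
    print(a, [round(corr[a][b], 2) for b in names])
```

A value near +1 or -1 indicates a strong positive or negative linear relationship, and a value near 0 indicates little or no linear relationship.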
TABLE 6.6 Combination 2
Algorithm               Accuracy (%)   Precision (%)   Sensitivity (%)   Specificity (%)
India
Logistic Regression     97.78          98.51           99.18             63.07
Gaussian Bayes Model    96.06          92.00           96.12             94.3
Random Forest           96.0           92.0            96.0              94.0
Decision Tree           93.7           98.1            74.8              72.9
China
Logistic Regression     96.98          98.14           98.64             70.94
Gaussian Bayes Model    94.7           89.23           94.80             91.0
Random Forest           94.0           89.0            94.0              91.0
Decision Tree           94.0           90.01           75.7              56.0
Sri Lanka
Logistic Regression     97.08          98.18           98.74             69.38
Gaussian Bayes Model    97.01          97.0            97.0              97.0
Random Forest           99.9           99.8            98.7              98
Decision Tree           96.7           97.7            66.2              64.0
Oman
Logistic Regression     97.175         98.141          98.54             80.18
Gaussian Bayes Model    95.90          96.28           96.0              95.0
Random Forest           96.0           96.0            96.0              95.0
Decision Tree           92.14          72.47           76.05             78.20
TABLE 6.7 Combination 3 (Without Considering WC)
Algorithm               Accuracy (%)   Precision (%)   Sensitivity (%)   Specificity (%)
India
Logistic Regression     96.0           95.8            96.2              94.5
Gaussian Bayes Model    96.24          96.0            96.0              94.0
Random Forest           99.0           99.0            99.2              98.0
Decision Tree           92.5           95.0            84.0              78.0
China
Logistic Regression     96.24          97.51           98.52             60.74
Gaussian Bayes Model    95.50          95.25           96.0              94.0
Random Forest           98.0           98.0            98.0              98.0
Decision Tree           96.0           96.0            96.0              94.0
Sri Lanka
Logistic Regression     96.52          97.78           98.46             64.07
Gaussian Bayes Model    95.47          92.5            91.0              89.5
Random Forest           98.0           97.8            96.4              92.2
Decision Tree           80.0           93.0            84.0              79.0
Oman
Logistic Regression     85.8           91.5            85.8              81.0
Gaussian Bayes Model    94.48          94.0            94.0              93.0
Random Forest           98.0           98.0            98.0              98.0
Decision Tree           95.0           92.0            90.15             89.7
FIGURE 6.5 Optimum value using ROC.
FIGURE 6.6 Correlation between the features for the Indian system.
FIGURE 6.7 Pair plot according to correlation feature values.
Using the ROC curve, the cutoff value for the risk score can be identified. The ROC curves were plotted for the diabetes risk score, with the sensitivity on the y-axis and the false positive rate (1 - specificity) on the x-axis. The better a test discriminates, the steeper the rising part of the ROC curve and the larger the area under the curve (AUC). The optimum value identified from the ROC curves is taken as the high-risk score for diabetes. ROC curves using different combinations for the Indian system are shown in Figure 6.8. It is observed that logistic regression performs better than all the other algorithms. The AUC measures how well the test separates the two groups: the higher the AUC, the better the test. The Indian system is validated using the AUC, and its value is 0.98. Similarly, the China, Sri Lanka, and Oman systems are validated, and the obtained values are 0.98, 0.97, and 0.94, respectively. A similar approach is applied for the remaining countries.
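The construction of an ROC curve, its AUC, and a cutoff can be sketched in a few lines of plain Python. The scores and outcomes below are hypothetical, not data from the study. The chapter does not state how the optimum cutoff was selected from the ROC curve, so the use of Youden's J (sensitivity + specificity - 1) here is an assumption, and the function names `roc_points` and `auc` are illustrative.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR, threshold) triples; score >= threshold counts as positive."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0, float("inf"))]  # nothing classified as positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives, threshold))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical risk scores and outcomes (1 = diabetic), NOT data from the study.
scores = [20, 35, 48, 60, 72, 85, 96, 110]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

points = roc_points(scores, labels)
# Assumed cutoff criterion: maximize Youden's J = TPR - FPR.
best_fpr, best_tpr, best_cutoff = max(points, key=lambda p: p[1] - p[0])
print("AUC:", auc(points))
print("cutoff:", best_cutoff, "sensitivity:", best_tpr, "specificity:", 1 - best_fpr)
```

Each threshold yields one point on the curve; sweeping the threshold from high to low traces the curve from (0, 0) to (1, 1).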
FIGURE 6.8 ROC curves showing the performance of the diabetes risk score in predicting diabetes.
Similarly, about 514,384 samples are used for analysis. The optimization of diabetes data using the k-nearest neighbors (KNN) classifier, decision tree (DT), random forest (RF), support vector machine (SVM), and Gaussian Bayes (GB) models is compared in Table 6.8.
TABLE 6.8 Analysis of Optimization of Different Machine Learning Algorithms
Values are training set accuracy / test set accuracy.
Machine Learning Algorithm   India             China             Sri Lanka         Oman
KNN Classifier               1.00 / 1.00       1.00 / 1.00       1.00 / 1.00       1.00 / 1.00
Decision Tree                1.00 / 1.00       1.00 / 1.00       1.00 / 1.00       1.00 / 1.00
Random Forest                1.00 / 1.00       1.00 / 1.00       0.94 / 0.94       0.93 / 0.93
Support Vector Machine       0.99 / 0.99       0.95 / 0.95       0.965 / 0.965     0.95 / 0.95
Gaussian Bayes Model         0.9611 / 0.9608   0.9404 / 0.9414   0.9443 / 0.9450   0.9261 / 0.9262
The system was developed using five simple parameters, namely, age, family history of diabetes, WC, physical activity, and BMI. The data for these factors can be obtained quickly through five questions, and the scores assigned to each factor are shown in Table 6.9.
6.6.2 FOR EUROPEAN COUNTRIES
As explained in Section 6.6.1, three combinations of parameters are used for prediction with different algorithms for the different European countries, and accuracy, precision, sensitivity, and specificity are calculated for each. Sensitivity and specificity rates are needed to draw the ROC curve. The confusion matrix for the European dataset for combination 2 is shown in Figure 6.9.
From the logistic regression model, the classification rate, precision, recall, sensitivity, and specificity for combination 2 are shown in Table 6.10. Values for the remaining combinations are obtained in the same way. Specificity is the true negative rate; using the logistic regression model for combination 2, it is 55.70%, 82.55%, 56.62%, and 55.83% for Cambridge, France, UK, and Danish, respectively. Sensitivity is the true positive rate, that is, the percentage of patients who are correctly identified as having the disease; using the logistic regression model for combination 2, it is 97.70%, 90.35%, 93.07%, and 91.98%, respectively, for the same European countries. These values are likewise computed using four different algorithms for the European countries, as shown in Tables 6.11-6.13.
TABLE 6.9 Diabetes Risk Score for Asian Countries
Parameters                                   Risk Score
Age
  <35                                        0
  35-49                                      22
  >50                                        34
Obesity (Waist Circumference)
  Female <80 cm, Male <90 cm                 0
  Female 80-89 cm, Male 90-99 cm             11
  Female >90 cm, Male >100 cm                20
Physical Activity
  Vigorous Exercise                          0
  Mild Exercise                              13
  No Exercise                                18
Family History
  Two Nondiabetic Parents                    0
  Either Parent Having Diabetes              18
  Both Parents Having Diabetes               29
BMI
  <25                                        0
  25-29                                      11
  30-34                                      16
  >35                                        29
Maximum Score                                130
Score >95: Very High Risk, 70-95: High Risk, 35-69: Medium Risk, <35: Low Risk
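To show how the five answers combine into a single score, the following minimal sketch adds up the points from Table 6.9. The function and parameter names are illustrative, not part of the original system. One assumption is made and marked in the code: boundary values that the table leaves unassigned (for example, an age of exactly 50 or a BMI of exactly 35) are counted in the upper band.

```python
def band_points(value, bands, top_points):
    """Return the points of the first band whose upper limit exceeds value."""
    for upper_limit, points in bands:
        if value < upper_limit:
            return points
    # Assumption: boundary values such as age 50 or BMI 35 fall in the top band.
    return top_points


def asian_risk_score(age, sex, waist_cm, activity, diabetic_parents, bmi):
    """Sum the Table 6.9 points for one person (illustrative helper)."""
    score = band_points(age, [(35, 0), (50, 22)], 34)
    # Waist circumference limits differ by sex: female 80/90 cm, male 90/100 cm.
    low, high = (80, 90) if sex == "female" else (90, 100)
    score += band_points(waist_cm, [(low, 0), (high, 11)], 20)
    score += {"vigorous": 0, "mild": 13, "none": 18}[activity]
    score += {0: 0, 1: 18, 2: 29}[diabetic_parents]
    score += band_points(bmi, [(25, 0), (30, 11), (35, 16)], 29)
    return score


# Highest band in every category reproduces the tabulated maximum score.
print(asian_risk_score(60, "male", 105, "none", 2, 36))
# A hypothetical 40-year-old woman, waist 85 cm, mild exercise, one diabetic parent, BMI 27.
print(asian_risk_score(40, "female", 85, "mild", 1, 27))
```

Keeping the bands as data rather than as nested conditionals makes it straightforward to substitute the points of another scoring table.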
FIGURE 6.9 Visualizing the confusion matrix.
TABLE 6.10 Accuracy, Sensitivity, and Specificity Percentage Using Machine Learning Algorithms
Logistic Regression   Accuracy (%)   Precision (%)   Recall (%)   Sensitivity (%)   Specificity (%)
Cambridge             84.42          87.36           93.13        97.70             55.70
France                87.58          90.36           90.35        90.35             82.55
UK                    84.52          87.49           93.07        93.07             56.62
Danish                82.69          85.76           91.98        91.98             55.83
The optimum value (80%) is taken as the high-risk score for diabetes; it is identified from the ROC curves for Cambridge, France, UK, and Danish, as shown in Figure 6.10.
FIGURE 6.10 Optimum value using ROC.
TABLE 6.11 Combination 1 (Without Considering BMI)
Algorithm               Accuracy (%)   Precision (%)   Sensitivity (%)   Specificity (%)
Cambridge
Logistic Regression     78.87          82.06           97.89             92.66
Gaussian Bayes Model    83.54          83              84                81
Random Forest           78             75              78                75
Decision Tree           72             80              72                74
France
Logistic Regression     73.94          78.76           81.55             60.17
Gaussian Bayes Model    71.3           73              71                72
Random Forest           77             74              76                73
Decision Tree           70             75              70                71
UK
Logistic Regression     79.05          82.15           97.75             92.77
Gaussian Bayes Model    84             84              84                82
Random Forest           78             75              78                75
Decision Tree           80             73              75                73
Danish
Logistic Regression     76.53          80.38           97.13             90.48
Gaussian Bayes Model    80.81          81              81                78
Random Forest           75             72              75                72
Decision Tree           76             70              72                70
TABLE 6.12 Combination 2
Algorithm               Accuracy (%)   Precision (%)   Sensitivity (%)   Specificity (%)
Cambridge
Logistic Regression     84.42          87.36           97.70             55.70
Gaussian Bayes Model    77.4           74              77                70
Random Forest           86.5           90              86                87
Decision Tree           65.5           74              64                66
France
Logistic Regression     87.58          90.36           90.35             82.55
Gaussian Bayes Model    67.50          66              67                61
Random Forest           80             87              80                80
Decision Tree           64             65.8            65.3              62
UK
Logistic Regression     84.52          87.49           93.07             56.62
Gaussian Bayes Model    77.45          75              77                70
Random Forest           86.5           90              86                87
Decision Tree           76             67              76                68
Danish
Logistic Regression     82.69          85.76           91.98             55.83
Gaussian Bayes Model    75.3           73              75                68
Random Forest           84.5           89              84                85
Decision Tree           74             65              74                64
TABLE 6.13 Combination 3 (Without Considering WC)
Algorithm               Accuracy (%)   Precision (%)   Sensitivity (%)   Specificity (%)
Cambridge
Logistic Regression     82.09          84.84           97.55             93.27
Gaussian Bayes Model    83.16          84              83                80
Random Forest           87             86              86                86
Decision Tree           77             72              77                71
France
Logistic Regression     87.58          90.36           90.35             82.55
Gaussian Bayes Model    71.40          73              71                72
Random Forest           69             70              69                69
Decision Tree           70             75              70                71
UK
Logistic Regression     82.01          85.00           92.88             86.32
Gaussian Bayes Model    81.90          83              82                78
Random Forest           88             86              86                86
Decision Tree           76             71              76                71
Danish
Logistic Regression     81.46          84.53           91.86             86.54
Gaussian Bayes Model    80.05          80              80                76
Random Forest           90             89              89                89
Decision Tree           75             71              75                71
The correlation between different features for the Cambridge system is shown in Figure 6.11. The pair plot according to the correlation features is shown in Figure 6.12. A similar approach is applied to all the remaining T2D systems of European countries and then analyzed.
FIGURE 6.11 Correlation between the features for the Cambridge system.
The ROC curves are drawn for the European countries with different algorithms and are used to identify the risk score, as shown in Figure 6.13.
The systems are cross-validated using the AUC: the values are 0.8829 for the Cambridge system, 0.9408 for the France system, 0.8962 for the UK system, and 0.8848 for the Danish system.
Similarly, about 514,384 samples are used for analysis. The optimization of diabetes data using the k-nearest neighbors (KNN) classifier, decision tree (DT), random forest (RF), support vector machine (SVM), and Gaussian Bayes (GB) models is compared in Table 6.14.
FIGURE 6.12 Pair plot according to correlation feature values.
FIGURE 6.13 ROC curves showing the performance of the diabetes risk score in predicting diabetes.
TABLE 6.14 Analysis of Optimization of Different Machine Learning Algorithms
Values are training set accuracy / test set accuracy.
Machine Learning Algorithm   Cambridge         France            UK                Danish
KNN Classifier               0.99 / 0.98       1.00 / 1.00       0.99 / 0.99       1.00 / 0.99
Decision Tree                1.00 / 1.00       1.00 / 1.00       1.00 / 1.00       1.00 / 1.00
Random Forest                1.00 / 1.00       1.00 / 1.00       0.87 / 0.86       0.827 / 0.827
Support Vector Machine       0.87 / 0.86       0.875 / 0.876     0.965 / 0.965     0.95 / 0.95
Gaussian Bayes Model         0.7744 / 0.7738   0.6760 / 0.6745   0.7745 / 0.7748   0.7526 / 0.7534
The system was developed using five simple parameters, namely, age, family history of diabetes, WC, physical activity, and BMI. The data for these factors can be obtained quickly through five questions, and the scores assigned to each factor are shown in Table 6.15.
TABLE 6.15 Diabetes Risk Score for European Countries
Parameters                                   Risk Score
Age
  <35                                        0
  35-49                                      15
  >50                                        23
Obesity (Waist Circumference)
  Female <80 cm, Male <90 cm                 0
  Female 80-89 cm, Male 90-99 cm             17
  Female >90 cm, Male >100 cm                20
Physical Activity
  Vigorous Exercise                          0
  Mild Exercise                              8
  No Exercise                                12
Family History
  Two Nondiabetic Parents                    0
  Either Parent Having Diabetes              19
  Both Parents Having Diabetes               26
BMI
  <25                                        0
  25-29                                      11
  30-34                                      15
  >35                                        19
Maximum Score                                100
Score >70: Very High Risk, 51-69: High Risk, 30-50: Medium Risk, <30: Low Risk
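The score bands printed under Tables 6.9 and 6.15 can be expressed as a small lookup, sketched below. The names `RISK_BANDS` and `risk_category` are illustrative. Scores are assumed to be whole numbers, since every entry in the two tables is an integer. The European bands as printed do not assign a category to a score of exactly 70, so the sketch returns `None` in that case rather than guess.

```python
# Inclusive (low, high, label) bands for integer scores, as printed in the tables.
RISK_BANDS = {
    "asian": [
        (0, 34, "low"),
        (35, 69, "medium"),
        (70, 95, "high"),
        (96, 130, "very high"),
    ],
    "european": [
        (0, 29, "low"),
        (30, 50, "medium"),
        (51, 69, "high"),
        (71, 100, "very high"),  # a score of exactly 70 is not assigned in the source
    ],
}


def risk_category(score, region):
    """Return the risk label for a score, or None if no printed band covers it."""
    for low, high, label in RISK_BANDS[region]:
        if low <= score <= high:
            return label
    return None


print(risk_category(75, "asian"))
print(risk_category(75, "european"))
print(risk_category(70, "european"))
```

The same total therefore maps to different categories in the two regions: a score of 75 is high risk on the Asian scale but very high risk on the European scale.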
From this work, it is observed that performance of the system is better using logistic regression compared to other machine learning algorithms for both Asian and European countries. The corresponding bar chart of accuracy calculation using logistic regression for Asian and European countries is shown in Figures 6.14 and 6.15. For India, China, Sri Lanka, and Oman, the accuracy is 97.78%, 96.98%, 97.08%, and 97.175%, respectively. Similarly, for Cambridge, France, UK, and Danish, the accuracy is 84.42%, 87.58%, 84.52%, and 82.69%, respectively.
FIGURE 6.14 Accuracy using logistic regression for Asian countries.
FIGURE 6.15 Accuracy using logistic regression for European countries.
CONCLUSION AND SCOPE OF FUTURE WORK
A simple diabetes risk assessment tool is developed and validated. A diabetes risk score system is developed for Asian and European countries using five different parameters, namely, age, WC, family history, physical activity, and BMI.
Several risk score tools have been developed, but predicting the correct risk score without losing simplicity is a challenging task. The proposed system is compared with the existing risk score systems for accuracy and performance. It can also be applied to different ethnic groups. From this work, it is observed that the system performs better using logistic regression than with the other machine learning algorithms. In conclusion, a diabetes risk score system is developed that can be used in a stepwise screening strategy for T2D to provide an individual, age-specific, personalized T2D risk score, using the p coefficient for each year instead of making an age group. From Tables 6.9 and 6.15, a score <35 for Asian countries and <30 for European countries is considered low risk; 35-69 for Asian countries and 30-50 for European countries is considered medium risk; 70-95 for Asian countries and 51-69 for European countries is considered high risk; and >95 for Asian countries and >70 for European countries is considered very high risk. Optimum values of >95 for Asian countries and >80 for European countries are identified using the ROC curves, and these values are validated using AUCs. This tool also indicates which factors affect T2D more. In future, precautionary measures for T2D will be provided by an expert panel.
KEYWORDS
 • type 2 diabetes
 • risk score
 • artificial intelligence