TESTING AND MATCHING IN MEDICINE
Medical testing is the process performed to detect, diagnose and monitor disease and then to establish a course of action. Medical tests are related to molecular diagnostics and clinical chemistry and are usually carried out in a medical laboratory.
A variety of tests are advised:
Diagnostic testing is a process performed to confirm or rule out the presence of disease in a patient who is suspected of having it. Figure 9.10 shows the diagnosis process.
The following are examples:
- • Whenever a patient is suspected of having lymphoma, a nuclear medicine imaging test is used for diagnosis.
- • To check for bacterial infection in a patient with a high fever, a complete blood count is used.
9.10.2 Screening
Screening tests are used in individuals with an elevated risk of developing a disease. Screening is carried out to manage epidemiology, to monitor disease prevalence, to prevent disease or for strictly statistical purposes.
The following are some examples:
- • Measuring the blood pressure of a patient based on existing data.
- • Measuring the blood sugar levels of a patient suspected of being diabetic.
- • Measuring thyroid-stimulating hormone (TSH) in the blood of a newborn to screen for congenital hypothyroidism.
FIGURE 9.10 Diagnosis process.
Monitoring is medical testing employed to track the progress of a disease or the response to a medical treatment.
9.10.4 Precision or Accuracy Aspects
The accuracy of a test is its correspondence with the true value. The precision of a test is its reproducibility, that is, how closely the results agree when the test is repeated on the same sample.
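The distinction can be made concrete with a small numerical sketch. It follows the standard metrological usage, in which accuracy is closeness to a reference value and precision is repeatability. The readings, the reference value and the function names are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev

def accuracy_error(measurements, true_value):
    """Bias: how far the average reading lies from the reference value."""
    return mean(measurements) - true_value

def precision_spread(measurements):
    """Sample standard deviation of repeated readings (smaller = more precise)."""
    return stdev(measurements)

# Hypothetical repeated readings on one sample whose reference value is 100.
reference = 100.0
assay_a = [99.0, 101.0, 100.0, 98.0, 102.0]    # centred on the reference, but scattered
assay_b = [105.0, 105.5, 104.5, 105.0, 105.0]  # tightly grouped, but offset

bias_a, spread_a = accuracy_error(assay_a, reference), precision_spread(assay_a)
bias_b, spread_b = accuracy_error(assay_b, reference), precision_spread(assay_b)

print(f"assay A: bias={bias_a:.2f}, spread={spread_a:.2f}")
print(f"assay B: bias={bias_b:.2f}, spread={spread_b:.2f}")
```

In this made-up data, assay A is the more accurate (no bias) and assay B is the more precise (smaller spread), which shows that the two properties are independent.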
CLASSIFICATION AND CLUSTERING IN MEDICINE
As we studied in Section 9.2, medical imaging plays a vital role in the diagnosis of disease. Proper identification or detection of the disease is possible only when the image data are clustered and classified properly. This can be achieved by selecting suitable and efficient classification and clustering algorithms.
Clustering is an unsupervised machine learning method. It is the process of grouping data based on similarity constraints. Here, the groups are called clusters. The objects within a group are similar to one another and different from the objects in other groups.
There are four types of clustering:
- 1. Hierarchical clustering
- 2. Partition methods
- 3. Density-based clustering
- 4. K-NN (K-nearest neighbour)
Hierarchical Clustering
Hierarchical clustering is a technique in which a cluster hierarchy, called a dendrogram, is built. Each cluster in this technique has child clusters, sibling clusters and a common parent. This technique is further divided into two methods:
a. Agglomerative clustering
b. Divisive clustering
Figure 9.11 shows the hierarchical clustering with dendrogram.
Agglomerative Clustering
Agglomerative clustering starts with clusters that each contain a single point and repeatedly merges them to form suitable clusters. First, every pattern is placed in its own class and the similarity coefficients of all pattern pairs are computed. Then the two most similar clusters are merged into one new cluster, and the inter-cluster similarity scores are recomputed. This merging of similar clusters is repeated until k clusters are formed.
FIGURE 9.11 Hierarchical clustering.
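The agglomerative procedure can be sketched in a few lines of Python. This is a minimal illustration, not an optimised implementation. It assumes that similarity is measured by Euclidean distance (smaller distance means more similar) and that the distance between two clusters is that of their closest pair of points (single linkage). Both choices, the function name and the sample points are assumptions made for the example.

```python
import math

def agglomerative(points, k):
    """Merge the two closest clusters (single linkage) until k clusters remain."""
    clusters = [[p] for p in points]               # every pattern starts in its own cluster
    while len(clusters) > k:
        best = None                                # (distance, i, j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]    # merge the most similar pair
        del clusters[j]                            # j > i, so index i is still valid
    return clusters

# Hypothetical two-dimensional feature vectors.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.1, 4.8), (9.0, 0.5)]
groups = agglomerative(points, k=3)
for g in groups:
    print(g)
```

Recomputing all pairwise distances at every merge is what makes this approach expensive on large datasets.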
Divisive Clustering
This method is the reverse of agglomerative clustering. Here, clustering begins with a single cluster that contains all the points. This cluster is then split into two or more appropriate clusters. The splitting proceeds from the top down, starting with all patterns in one cluster and ending with each pattern in its own cluster.
These hierarchical methods make decision making easier, because the result can be read directly from the dendrogram, as shown in Figure 9.11. However, they are not suitable for large datasets and are very sensitive to outliers.
Partition Methods
These methods divide the entire dataset into k partitions, or clusters. They generally use an iterative optimisation mechanism; that is, points are iteratively reassigned between the clusters. These algorithms are classified as:
- 1. K-means
- 2. K-medoids
K-Means Clustering
In k-means clustering, data is partitioned into k groups. These groups are disjoint clusters. The method consists of two phases. In the first phase, the centroid of each cluster is calculated; in the second phase, each point is assigned to the cluster whose centroid is nearest. This method is commonly used in health services. For example, people with heart disease can be grouped based on their blood pressure and cholesterol levels. K-means clustering is easy to implement and produces tighter clusters than hierarchical clustering. However, it is difficult to predict the number of clusters in advance, and the method is sensitive to the scale of the features.
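The two phases can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean distance, initial centroids supplied by the caller, and a made-up set of (systolic blood pressure, cholesterol) pairs. The function name `k_means` and all values are hypothetical.

```python
import math

def k_means(points, centroids, iterations=20):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        # Assignment phase: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update phase: each centroid moves to the mean of its cluster.
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])   # an empty cluster keeps its centroid
        if new_centroids == centroids:               # no change, so the method has converged
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical patients: (systolic blood pressure in mmHg, cholesterol in mg/dL).
patients = [(118, 180), (122, 190), (125, 185), (160, 260), (165, 250), (158, 270)]
centroids, clusters = k_means(patients, [patients[0], patients[3]])
print(centroids)
print(clusters)
```

Because the distance is computed on raw values, a feature with a larger numeric range dominates the result; in practice the features would usually be standardised first.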
FIGURE 9.12 Partition methods of clustering: (a) K-means clustering; (b) K-medoids clustering.
K-Medoids Clustering
K-medoids is analogous to k-means, but in k-medoids the cluster centres are always actual data points, whereas in k-means the centre of a cluster need not be a data point (refer to Figure 9.12). Both methods aim at reducing the distance between the data points and the centre of their cluster.
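A minimal k-medoids sketch makes the difference visible: the centre of each cluster is chosen from among its own members. The version below uses a simple alternating scheme (assign, then pick the member with the smallest total distance to the others), which is one of several ways to compute medoids. The function name, the starting medoids and the data are assumptions for illustration.

```python
import math

def k_medoids(points, medoids, iterations=20):
    """Alternate between assigning points and re-selecting each cluster's medoid."""
    for _ in range(iterations):
        # Assign every point to its nearest medoid.
        clusters = [[] for _ in medoids]
        for p in points:
            nearest = min(range(len(medoids)), key=lambda m: math.dist(p, medoids[m]))
            clusters[nearest].append(p)
        # The new medoid is the member with the smallest total distance to the others.
        new_medoids = []
        for i, members in enumerate(clusters):
            if members:
                new_medoids.append(min(members, key=lambda c: sum(math.dist(c, q) for q in members)))
            else:
                new_medoids.append(medoids[i])
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, clusters

# Hypothetical data: two well-separated groups.
points = [(1.0, 0.0), (2.0, 0.0), (4.0, 0.0), (10.0, 0.0), (11.0, 0.0), (13.0, 0.0)]
medoids, clusters = k_medoids(points, [points[0], points[5]])
print(medoids)
```

Here the k-means centroid of the first group would be about (2.33, 0.0), which is not a data point, while the medoid is the actual point (2.0, 0.0).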
Density-Based Clustering
Density-based clustering is a technique in which clusters are dense areas that are separated from one another by sparse areas. The method has two parameters, epsilon (the neighbourhood radius) and the minimum number of points, and each cluster has to contain at least a single point. A point starts or extends a cluster when the number of neighbouring points within epsilon is greater than or equal to the minimum number of points. Subsequently, the algorithm continues to iterate over the remaining data points in the set. It is robust to outliers, and there is no need to specify the number of clusters as in k-means. However, it does not give good results when the differences in densities are large. Figure 9.13 shows density-based clustering.
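The role of the two parameters can be illustrated with a compact DBSCAN-style sketch. DBSCAN is a well-known density-based algorithm; the text does not name a specific one, so its use here is an assumption. The sketch counts a point as its own neighbour, and the function name, parameter values and data are hypothetical.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    def neighbours(i):
        return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)          # None means "not yet visited"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1                 # too few neighbours: provisionally noise
            continue
        cluster += 1                       # i is a dense (core) point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # noise next to a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:       # j is also dense, so keep expanding from it
                queue.extend(nbrs)
    return labels

# Hypothetical data: two dense groups and one isolated point.
points = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (1.1, 0.8),
          (5.0, 5.0), (5.2, 5.1), (4.9, 5.3), (5.1, 4.8),
          (9.0, 1.0)]
labels = dbscan(points, eps=0.6, min_pts=3)
print(labels)
```

The number of clusters is never passed in; it emerges from the density of the data, and the isolated point is reported as noise instead of being forced into a cluster.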
K-Nearest Neighbour
The K-NN classification rule assigns to a test sample the majority category label of its k nearest training samples. K-NN assumes that all instances are points in some n-dimensional space and defines neighbours in terms of distance (usually Euclidean in R-space). Strictly speaking, K-NN is a supervised classification rule rather than a clustering method, because it relies on labelled training samples. It is a simple technique that is easily implemented, and the cost of the learning process is zero. However, it is computationally expensive to find the k nearest neighbours when the dataset is very large. Figure 9.14 shows the K-NN clustering.
FIGURE 9.13 Density-based clustering.
FIGURE 9.14 K-NN clustering.
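The K-NN rule is short enough to write out directly. The sketch below assumes Euclidean distance and a simple majority vote; the function name, the feature vectors and the labels "benign" and "malignant" are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """train is a list of (feature_vector, label) pairs; returns the majority label."""
    # There is no training phase: all the work happens at query time.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labelled samples in a two-dimensional feature space.
train = [((1.0, 1.0), "benign"), ((1.5, 2.0), "benign"), ((2.0, 1.5), "benign"),
         ((6.0, 6.5), "malignant"), ((7.0, 6.0), "malignant"), ((6.5, 7.0), "malignant")]

label_a = knn_classify(train, (2.0, 2.0), k=3)
label_b = knn_classify(train, (6.0, 6.0), k=3)
print(label_a, label_b)
```

Every query sorts the whole training set by distance, which is why the method becomes expensive when the dataset is very large.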
Classification is the final and major step in the process of diagnosis. It is a supervised learning method in which the input data is classified using existing training data. There are many methods by which the given input data can be classified. These methods are mainly divided into three categories.
Texture Classification
Texture classification is a key component of many medical applications. It aims at assigning an unknown sample to one of a set of known texture classes. It belongs to the texture analysis domain.
Neural Networks
A neural network is one of the most commonly used methods in artificial intelligence for problem solving. It has three types of layers: input, hidden and output, as shown in Figure 9.15.
It mainly consists of five basic steps:
- • Finding the net output at the different output layer nodes.
- • Designing an objective function mathematically.
- • Designing the cost function.
- • Minimising the error.
- • Updating the weights.
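The five steps can be traced in a very small network written from scratch. This is only a sketch under several assumptions that do not come from the text: sigmoid activations, mean squared error as the cost, per-sample gradient descent, two hidden nodes, a learning rate of 0.5 and a toy logical-OR dataset. All function and variable names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Compute the hidden activations and the net output for one input vector."""
    inputs = x + [1.0]                                   # append a constant bias input
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, inputs))) for ws in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output

def cost(data, w_hidden, w_out):
    """Cost function: mean squared error over the dataset."""
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)

def train_epoch(data, w_hidden, w_out, lr):
    """Minimise the error by gradient descent, updating the weights after each sample."""
    for x, t in data:
        hidden, y = forward(x, w_hidden, w_out)
        delta_out = (y - t) * y * (1.0 - y)              # error signal at the output node
        delta_hid = [delta_out * w_out[j] * h * (1.0 - h) for j, h in enumerate(hidden)]
        for j, v in enumerate(hidden + [1.0]):
            w_out[j] -= lr * delta_out * v               # update hidden-to-output weights
        for j, ws in enumerate(w_hidden):
            for i, v in enumerate(x + [1.0]):
                ws[i] -= lr * delta_hid[j] * v           # update input-to-hidden weights

random.seed(0)                                           # fixed seed for repeatability
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(3)]

initial_cost = cost(data, w_hidden, w_out)
for _ in range(2000):
    train_epoch(data, w_hidden, w_out, lr=0.5)
final_cost = cost(data, w_hidden, w_out)
print(f"cost before: {initial_cost:.4f}, after: {final_cost:.4f}")
```

The `forward` function corresponds to finding the net output, `cost` to the objective and cost function, and `train_epoch` to minimising the error and updating the weights.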
FIGURE 9.15 Neural networks.
Data Mining
Classification using data mining techniques involves the use of sophisticated data analysis tools to find relationships in large volumes of data. The data mining classification techniques that are used for medical image classification are as follows:
i. Modified K-NN: This uses an instance-weighting scheme that is based on a distance measure. It is reported to give better accuracy than other algorithms.
ii. One-class KPCA: This method identifies the data objects of one specific class among all other data objects.
iii. Cascading of one-class KPCA: This algorithm aims at high accuracy. It uses a reject option in order to reduce the cost of misclassification.
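The idea of weighting instances by distance, mentioned under the modified K-NN in item i, can be illustrated with a generic distance-weighted vote. This is not the specific algorithm from the literature, whose details are not given here. It is a common variant in which each of the k neighbours votes with weight 1/distance; the function name, the small constant `eps` and the data are assumptions.

```python
import math
from collections import defaultdict

def weighted_knn(train, query, k, eps=1e-9):
    """Classify query by a vote in which closer neighbours carry more weight."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    scores = defaultdict(float)
    for features, label in nearest:
        # eps avoids division by zero when the query coincides with a training point.
        scores[label] += 1.0 / (math.dist(features, query) + eps)
    return max(scores, key=scores.get)

# Hypothetical samples: one very close "A" and two distant "B" neighbours.
train = [((1.0, 1.0), "A"), ((3.0, 3.0), "B"), ((3.2, 3.0), "B")]
query = (1.1, 1.0)

predicted = weighted_knn(train, query, k=3)
print(predicted)
```

With k = 3 an unweighted majority vote would choose "B" (two votes to one), whereas the weighted vote chooses "A" because that neighbour is far closer to the query.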
Machine learning has changed the healthcare system by providing many advantages to the public. It has had a great impact on lifestyles and well-being. It has helped to bridge the gap between patients and doctors while saving time and reducing the cost of health services. Its multidisciplinary nature helps in solving health problems by drawing on several disciplines. Further improvement of machine learning can lead to robots that assist doctors during complex surgeries. Mining data to discover useful information can also improve screening and profiling for better drug discovery. This chapter has focused on many of the new advancements taking place in the field of machine learning that can in turn help in providing better healthcare services to patients.