Cloud Computing Based Intelligent Healthcare System
- Building an intelligent healthcare system
- Early detection and prediction of brain tumor using Intelligent Cloud
- Classification using different models
- Image Inversion
- Experiments and Results
- Naive Bayes Classifier Model
- CNN Model
- Image Inversion
- Summary and Discussion
- Research Challenges and possible solutions
Department of Computer Science and Engineering, University at Buffalo, State University of New York (SUNY)
The healthcare industry today is about much more than just the health of its patients. The need for a reliable, scalable and cost-effective IT infrastructure to support the growing demand for clinical, administrative and financial functions is driving fast adoption of cloud technology in healthcare organizations. Coupled with machine learning technology, the cloud can be turned into an intelligent healthcare system. Machine learning is the next level of evolution in automation, where machines learn from various data, eliminating the need for human intervention. It has the ability to process information beyond the capability of a human mind and then reliably analyze these data to improve decisions about patient diagnoses and to find more efficient treatment options. Combining these technologies, we get the intelligent cloud. It can learn from the vast amount of data already stored in the cloud to analyze difficult situations, make predictions and suggest efficient and convenient healthcare solutions. This can benefit the advancement of both fields. As one IBM article states,
"Digital transformation has become an ongoing process rather than a one-time goal, with market-attuned companies continually on the hunt for the next big technology shift that gives them a competitive advantage. That next big shift is the fusion of artificial intelligence and cloud computing, which promises to be both a source of innovation and a means to accelerate change." The intelligent cloud is not meant to replace doctors or hospitals; rather, it will assist current healthcare technology in deciding on effective ways of treatment. But what if it behaves differently and gives us wrong results? Can we afford such risks in healthcare? Here comes the role of interpretable machine learning, which deals with building trust in a machine learning model. A system equipped with such advanced technologies, i.e., cloud computing and machine learning, also needs to be reliable, and interpretable machine learning provides us with trustworthy models. One of the important concerns in the field of medical science is the early detection and prediction of brain tumors in humans. Achieving this requires the detection and effective classification of tumor types. The conventional method involves acquiring images through MRI, followed by inspection by radiologists to identify the specific characteristics of the images produced. With such methods, handling large data and reproducing any specific type of tumor are highly impractical, which is why an intelligent cloud is needed to get the desired results.
Cloud computing technologies are being widely used in the healthcare industry in numerous ways. The International Data Corporation (IDC) estimated that healthcare providers would account for 48% of total spending on industry cloud in 2018-2019. With growing challenges in healthcare, the requirement for reliable, flexible and scalable hardware infrastructure has been continuously increasing, and cloud computing is a one-stop solution to all these requirements. Be it healthcare apps, medical equipment, hospitals, pharmaceutical companies, insurance companies, universities, or healthcare platforms, all these organizations take advantage of the cloud for their various needs.
The data that flows in such industries is very crucial as it includes patients' personal information as well as their medical histories. Protecting these data while making efficient use of them cannot be achieved through cloud computing alone. Machine learning and data analytics play a very important role here in creating intelligent systems that will help humans achieve efficient healthcare solutions. Thus, cloud computing and machine learning can be used together, known as the intelligent cloud, to discover new and effective ways of treatment. Not only in treatment, but all the novel services mentioned in the previous paragraph can also use this advanced technology to increase the ease of use of such applications.
Even though security is provided by different cloud providers like Amazon Web Services (AWS), Azure, etc., trusting the predictions or solutions provided by a system is also required. Interpretable machine learning can help achieve this by building trust in such intelligent cloud systems. In healthcare, it is very important to have reliable and trustworthy models that help in creating not only innovative solutions but also the right solutions. The solutions that we get in such industries have a direct impact on the well-being of a person. This fact alone is reason enough to make sure the solution is absolutely secure and trustworthy. As a running example, we will discuss the early detection and prediction of brain tumors in humans and how it can be achieved through the combination of these growing technologies.
For effective treatment of brain tumors, it is very important to detect and classify the types of tumors. The conventional method involves acquiring images through Magnetic Resonance Imaging (MRI), followed by inspection by radiologists to identify the specific characteristics of the images produced. With such methods, handling large data and reproducing any specific type of tumor are highly impractical, which is why mathematical or computational models are needed to handle such classifications.
Building an intelligent healthcare system
A huge number of factors come into the picture when we are trying to build a product that combines more than one advanced technology, and understanding the need for such a system is very important. I will be discussing a few advantages of building an intelligent healthcare system in the following paragraphs:
- 1. Scalability: With the growing demand for healthcare and the dynamic nature of needs in this industry, scalability is a major concern. Due to the flexible nature of cloud computing technology, providing a scalable infrastructure is not an issue.
- 2. Reliability: Considering the sensitivity of the data flowing in this industry, it is important to have a reliable system. Cloud computing provides a reliable platform as it ensures minimal data loss.
- 3. Cost-efficient research: The intelligent cloud takes care of data analysis and prediction from the stored data through its machine learning technologies. The cloud also provides a low-cost platform, yielding strong research results at a fairly low price.
- 4. Dynamic data: Today, hospitals and healthcare applications use real-time data to monitor patients' information and provide solutions. The intelligent healthcare system can help deal with such data.
- 5. Communication: The healthcare industry consists of hospitals, universities, insurance companies, pharmaceutical companies, and research companies. With intelligent systems, there is effective and faster communication between such modules in the industry.
- 6. Security: The data has to be secure, which is ensured by the cloud computing platform. In addition, interpretable machine learning ensures safe use of such data and efficient as well as correct predictions or solutions.
Even though there are numerous advantages and uses of such a system, there are various challenges as well in creating it. A lot of work has been done in combining AI with the cloud to create powerful systems used across fields, including the health industry. But segregating the right data, understanding the model that is giving us solutions and choosing the right solution require a lot of time and effort. Interpretable machine learning is comparatively very new, even though very useful. There are not enough models or enough work done in healthcare to help us create a better model, or trust one, for the crucial problems existing in healthcare. As discussed in work on the future of interpretability in healthcare, a large number of questions remain unaddressed in the area of interpretable models; but looking at the various solutions being discussed, it is a great area of research and can have unimaginable benefits in this industry.
Early detection and prediction of brain tumor using Intelligent Cloud
There are various kinds of machine learning as well as deep learning models that can be used for the classification of brain tumors. The Naive Bayes classifier is one of the simplest machine learning models for classifying a set of data into specific, known classes. It is the base model we have used for classification, to understand how accurately such a model performs and whether it can be relied on for classification tasks of this kind. Convolutional neural networks (CNNs) have a very large learning capacity and make strong and mostly correct assumptions about the nature of images. It is also well known that the success of these networks largely depends on how big the dataset is and how well trained the network is. However, prior work shows that, contrary to the belief that learning is necessary for building good image priors, a great deal of image statistics is captured by the structure of a convolutional image generator independent of learning. This motivated us
to propose an interpretable CNN model for brain tumor classification to achieve effective classification.
An interpretable CNN model helps us build trust in the model. We achieve interpretability by inverting the inner layers of the CNN and trying to restore the original image with the tumor region, as proposed by the authors of [241]. This method allows us to analyze the features learned in the inner layers of the network and understand why a specific classification is made. Instead of the network depending totally on data for its training, we try to understand the mechanism behind the layers of the network architecture. Using the above-mentioned approaches, we perform classification and compare the efficiency of each model. We also try to understand the advantages of using an interpretable model structure instead of a black-box model.
Classification using different models
For better results and simpler training, Convolutional Neural Network (CNN) models can be used for the classification task. A typical CNN model consists of an input layer, multiple hidden layers and an output layer. The hidden layers of a CNN typically consist of convolutional layers, pooling layers, fully connected layers and normalization layers. These layers are responsible for learning features of the image and using these characteristics for proper classification. Various authors have proposed interpretable CNN models; more details of the architecture used are discussed in the experiments section.
Python can be used to implement the models described above. The Python package Keras makes implementation of all the layers involved fairly simple and easy. Keras is a high-level neural network API written in Python; TensorFlow serves as its backend for this project.
Image Inversion
The classification model used in medical science has to be trustworthy, as the data is highly sensitive. To develop trust in the model, the deep learning model is interpreted, i.e., what the inner layers are learning about the input image is represented. This section introduces the algorithm to compute an approximate inverse of the input image, formulated as the problem of finding an image whose representation best matches the representation of the input. Formally, given a representation function Φ : ℝ^(H×W×C) → ℝ^d and a representation Φ₀ = Φ(x₀) to be inverted, reconstruction finds the image x ∈ ℝ^(H×W×C) that minimizes the objective in Eq. (10.1):

x* = argmin_{x ∈ ℝ^(H×W×C)} ℓ(Φ(x), Φ₀) + λ R(x) (10.1)

where ℓ is a loss that compares Φ(x) to the target Φ₀ and R is a regulariser capturing a natural image prior. The output of the above equation is an image whose representation resembles the representation of the input image. There may not be a unique solution to this problem, and a sample of possible reconstructions characterizes the space of images that the representation deems to be equivalent, thus revealing its invariances.
Loss function. There are many possible choices for the loss function ℓ, but for our algorithm we use the Euclidean distance, calculated as per Eq. (10.2):

ℓ(Φ(x), Φ₀) = ‖Φ(x) − Φ₀‖² (10.2)

In this implementation, a normalized version of the loss is used: the loss is divided by ‖Φ₀‖² so that its dynamic range is fixed and contained in the [0, 1) interval, touching zero at the optimum.
Regularisers. Discriminatively trained representations may discard some low-level image statistics, as these are usually not as interesting for high-level tasks. But since these low-level statistics are useful for visualization, they can be partially recovered by restricting the inversion to a subset of natural images X. This restriction requires modelling the set X, and an appropriate image prior can be used in the reconstruction as a proxy. In the paper, two such image priors are used. The first is simply the α-norm R_α(x) = ‖x‖_α^α, where x is the vectorised image. Divergence can be avoided by choosing a large value of α so that the range of the image stays within the target interval. The second regulariser is the Total Variation (TV) norm R_{Vβ}(x), encouraging images to consist of piecewise constant patches. The TV norm used is as depicted in Eq. (10.3):

R_{Vβ}(x) = Σ_{i,j} ((x_{i,j+1} − x_{i,j})² + (x_{i+1,j} − x_{i,j})²)^(β/2) (10.3)
To make the dynamic ranges of the regularisers comparable, one can aim for a solution x* of roughly unitary Euclidean norm. This requirement is captured by the objective ‖Φ(σx) − Φ₀‖²/‖Φ₀‖² + R(x), where the scaling σ is the average Euclidean norm of natural images in a training set. Also, the multiplier λ_α of the α-norm regulariser should be selected to encourage the reconstructed image σx to be contained in a natural range [−B, B] (for our implementation, B = 128). The final form of the objective function is expressed as per Eq. (10.4):

x* = argmin_x ‖Φ(σx) − Φ₀‖²/‖Φ₀‖² + λ_α R_α(x) + λ_{Vβ} R_{Vβ}(x) (10.4)
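As a concrete illustration, the terms of this objective can be sketched in NumPy. The function names and the λ values below are our own illustrative choices, not from the chapter:

```python
import numpy as np

def normalized_loss(phi_x, phi_0):
    """Euclidean loss divided by ||phi_0||^2, so it lies in [0, 1) and
    touches zero at the optimum, as described in the text."""
    return np.sum((phi_x - phi_0) ** 2) / np.sum(phi_0 ** 2)

def tv_norm(x, beta=2.0):
    """Total Variation regulariser of Eq. (10.3): sums finite differences
    to encourage piecewise-constant patches."""
    dh = x[1:, :-1] - x[:-1, :-1]   # vertical differences x[i+1,j] - x[i,j]
    dw = x[:-1, 1:] - x[:-1, :-1]   # horizontal differences x[i,j+1] - x[i,j]
    return np.sum((dh ** 2 + dw ** 2) ** (beta / 2.0))

def objective(x, phi, phi_0, lam_alpha=1e-6, lam_tv=1e-4, alpha=6.0):
    """Full objective of Eq. (10.4): normalized loss + alpha-norm + TV
    priors (scaling sigma omitted for brevity)."""
    return (normalized_loss(phi(x), phi_0)
            + lam_alpha * np.sum(np.abs(x) ** alpha)
            + lam_tv * tv_norm(x))
```

Note that a constant image has zero TV norm, and the normalized loss vanishes exactly when the representations match, so the objective reduces to the priors alone at the optimum.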
Experiments and Results
The brain tumor dataset used in this project contains 3064 T1-weighted contrast-enhanced images from 233 patients with three kinds of brain tumor: meningioma (708 slices), glioma (1426 slices), and pituitary tumor (930 slices). Figures 10.1-10.6 show images of the three kinds of brain tumor and the corresponding tumor regions. The data is organized in MATLAB® data format (.mat files).
Figure 10.1: Type 1: meningioma.
Figure 10.2: Type 1: tumor region.
Figure 10.3: Type 2: glioma.
Figure 10.4: Type 2: tumor region.
Figure 10.5: Type 3: pituitary tumor.
Figure 10.6: Type 3: tumor region.
Each file stores a struct containing the following fields for an image:
- cjdata.label: 1 for meningioma, 2 for glioma, 3 for pituitary tumor.
- cjdata.PID: patient ID.
- cjdata.image: image data.
- cjdata.tumorBorder: a vector storing the coordinates of discrete points on the tumor border, e.g., [x1, y1, x2, y2, ...], in which x1, y1 are planar coordinates on the tumor border. It was generated by manually delineating the tumor border, so it can be used to generate a binary image of the tumor mask.
- cjdata.tumorMask: a binary image with 1s indicating the tumor region.
Figure 10.7: Convolutional neural network layers.
Naive Bayes Classifier Model
For processing the data, we used the h5py Python library to read the .mat files, extract the image column from the 3064 images and store them in a pandas DataFrame. We used 70% of the data to train the Naive Bayes model and tested the model on the remaining 30%. We used the Gaussian Naive Bayes classification model from the sklearn Python library to achieve the classification and obtained an 89% accuracy. This model acts as a base model for our experiments and helps us classify the tumors using a simple yet effective learning model that uses the available data.
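A minimal sketch of this pipeline, assuming the cjdata field layout described above; the helper names and directory handling are our own, and the pandas DataFrame step of the real experiment is omitted for brevity:

```python
import numpy as np

def load_mat_images(paths):
    """Read v7.3 .mat files with h5py and return flattened images + labels,
    following the cjdata/image and cjdata/label layout of the dataset."""
    import h5py  # deferred so the rest of the sketch runs without it
    images, labels = [], []
    for p in paths:
        with h5py.File(p, "r") as f:
            images.append(np.array(f["cjdata/image"], dtype=np.float32).ravel())
            labels.append(int(np.array(f["cjdata/label"]).squeeze()))
    return np.stack(images), np.array(labels)

def train_gaussian_nb(X, y, test_size=0.3, seed=0):
    """70/30 split + Gaussian Naive Bayes, as in the experiment; returns
    the held-out test accuracy."""
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    Xtr, Xte, ytr, yte = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y)
    model = GaussianNB().fit(Xtr, ytr)
    return model.score(Xte, yte)
```

In practice, `train_gaussian_nb(*load_mat_images(paths))` reproduces the baseline described above on the full set of .mat slices.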
CNN Model
We chose a CNN model for its ease of training. Initially, we started with just 2 convolutional layers and 1 dense layer with a softmax activation function, only to get 62.6% accuracy. We made the model more efficient by expanding the inner layers: we increased the number of convolutional layers, introduced pooling layers, added dropout and used 2 dense layers in the improved CNN architecture; the final architecture diagram is shown in Figure 10.7. We achieved 73% accuracy with this model.
We used the softmax activation function so that the output is one of the 3 tumor classes. We use categorical cross-entropy as our loss function, as we need multiclass classification with probabilities in [0, 1]. It helps us measure the performance of the model by comparing the predicted probabilities with the actual labels.
To increase the efficiency of the model, we used the k-fold cross-validation method. The process included shuffling the dataset randomly and splitting it into k groups. For each unique group, we took that group as a hold-out or test data set and took the remaining groups as a training data set, then fitted a model on the training set and evaluated it on the test set. We then retained the evaluation score and summarized the skill of the model using the sample of evaluation scores. We used 3 folds and achieved 92.84%, 94.22%, and 97.06% test accuracy in the 3 folds, respectively, with an average accuracy of 94.71% (+/- 1.76% standard deviation).
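The k-fold procedure above can be sketched model-agnostically with scikit-learn; the function name is ours, and any estimator exposing fit/score (including a wrapped Keras model) can be passed in:

```python
import numpy as np
from sklearn.model_selection import KFold

def kfold_scores(make_model, X, y, k=3, seed=42):
    """Shuffle, split into k folds, hold each fold out once, and return the
    per-fold accuracies plus their mean and standard deviation, as
    summarized in the text."""
    scores = []
    for train_idx, test_idx in KFold(n_splits=k, shuffle=True,
                                     random_state=seed).split(X):
        model = make_model()                      # fresh model per fold
        model.fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[test_idx], y[test_idx]))
    return scores, float(np.mean(scores)), float(np.std(scores))
```

Refitting a fresh model per fold is what keeps each evaluation honest: no fold's test data ever leaks into its own training.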
This helped us achieve a fair amount of accuracy in our classification method, but to identify the tumor region we needed a model that can predict the region or reconstruct the tumor images. We wanted to interpret the inner layers of the CNN, and the next method describes how we achieved this. The CNN model that we used for this project can be found below. For a deeper understanding of each layer type, please refer to the convolutional neural network literature, as individual layers are not described in detail here.
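As a hedged sketch of such a model: a seven-convolutional-layer Keras network whose last convolutional layer has 128 filters of kernel size 2, matching the inversion discussion; the earlier filter counts and the dropout rate are illustrative assumptions, not the exact published configuration:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_classifier(input_shape=(512, 512, 1), n_classes=3):
    """Seven Conv2D layers with pooling, dropout, and two dense layers
    ending in a 3-way softmax over the tumor classes."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(128, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(128, 2, activation="relu"),  # last conv layer, inverted later
        layers.Flatten(),
        layers.Dropout(0.25),                      # illustrative rate
        layers.Dense(64, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",  # multiclass loss from the text
                  metrics=["accuracy"])
    return model
```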
Image Inversion
After training the CNN model and validating its classification accuracy, the weights of the model with the highest accuracy are saved. For image inversion, the weights of the saved CNN classification model are used. We create a network with the same 2D convolution layers, activation layers and max pooling layers. The network does not contain the dense and dropout layers, and no batch normalization is needed.
Figure 10.8: Images of features in last convolutional layer at every 100th iteration for Type-1 tumor.
In image inversion, we interpret what the inner layers of the CNN are learning. In our implementation, we represent what is being learnt by the last convolutional layer (the seventh layer of our seven-convolutional-layer network), which has 128 filters of kernel_size 2.
The images in the brain tumor dataset are two-dimensional. For our implementation, we reshaped the images to have four dimensions (batch_size, height, width, depth). Since a single image is given as input, the batch_size is 1. The input images are grayscale, so the depth of the image is 1. The width and height of the input image are 512.
The loss calculated in the network is the sum of the Euclidean distance loss and the total variation loss. The network uses the Adam optimizer to minimize this loss. The network takes the input image and generates an output at every 100th iteration, with the first generated image showing what the last layer learnt when the network is initialized with the weights of the classification model and the raw image is given as input. For later iterations, the output of the previous iteration is used as the input to the network. Figure 10.8 shows the output at every 100th iteration, depicting the minimization of the loss in the process.
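The inversion loop can be sketched as follows, assuming a `feature_model` that maps an image batch to the last convolutional layer's activations (built from the saved classification weights as described above); the step count, learning rate, and TV weight are illustrative, and TensorFlow's built-in TV uses absolute differences rather than the squared form of Eq. (10.3):

```python
import tensorflow as tf

def invert(feature_model, x0, steps=300, lr=0.05, lam_tv=1e-4):
    """Optimize an image so its representation matches that of x0.

    Loss = normalized Euclidean distance to the target representation
    plus a Total Variation prior, minimized with Adam."""
    phi0 = feature_model(x0)                       # target representation
    x = tf.Variable(tf.identity(x0))               # the image we optimize
    opt = tf.keras.optimizers.Adam(learning_rate=lr)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            loss = (tf.reduce_sum((feature_model(x) - phi0) ** 2)
                    / tf.reduce_sum(phi0 ** 2)     # normalized Euclidean loss
                    + lam_tv * tf.reduce_sum(tf.image.total_variation(x)))
        grads = tape.gradient(loss, [x])
        opt.apply_gradients(zip(grads, [x]))
    return x.numpy()
```

Snapshotting `x` every 100 iterations inside the loop would reproduce the progression shown in Figure 10.8.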
Summary and Discussion
This section proposed an interpretable convolutional neural network model for classifying brain tumors into various types and for inverting shallow and deep representations in the inner layers of the model, so that brain tumor regions can be reconstructed for proper analysis and diagnosis even at an early stage. The image inversion method, used with a simple classifying CNN, gives us the information learnt at each layer. This information helped us reconstruct the affected area, which will help in earlier detection as well as prediction of a possible brain tumor occurrence. There are other CNN models which can be referred to and enhanced to find better results; such models are mentioned in [333, 402].
Research Challenges and possible solutions
While the above solutions sound good in theory, it is extremely difficult to understand how various highly efficient existing machine learning models work under the hood. Complexity is added by the unknown as well. The introduction of interpretation into machine learning models is still very theoretical and has not seen many advancements in practice. Defining boundaries or scope for interpretability in a model is also very difficult. Considering the industry in which this system is being considered, i.e., healthcare, the rise of cloud computing is huge, but reliance on the other advanced technology, i.e., machine learning, is comparatively low, and the technology itself is very complex. Only more research work and findings using the already existing models can lead us to a more explainable model, which in turn can utilize the existing cloud infrastructure and result in effective predictability.
Apart from the above challenge, most models are often biased toward the type of data being learnt by the model. So studying the outputs of a particular model over various iterations, or examining the predictions again and again, does not by itself yield accurate interpretability. For the interpretation to be really accurate and trustworthy, it might be necessary to identify the blind spots in the process and perhaps to segregate the right training data from the data otherwise used. In some cases, understanding the features taken into account in each layer of a deep learning model might also give good insight into interpretability. There are various ways in which this can be achieved, and identifying the right one for the problem we have and the solution we are trying to get is a major challenge as well.
The scope for research in creating an intelligent cloud system with interpretable machine learning is huge, and with the growing needs and usage of these technologies in the healthcare industry, it is becoming the most efficient solution to cater to such needs. As we could see in the previous sections, innovative as well as efficient solutions can now be achieved with such systems. Along with the scalable and reliable cloud infrastructure provided to process such huge amounts of data, these data can also be utilized for innovations and improved solutions upgraded with new capabilities. The more we analyze the patterns that indicate the beginning of a chronic disease, the more correctly we can predict such deadly diseases and provide early diagnoses to humans. More research and methodologies for understanding interpretable methods and classification can be found in [189, 94, 78, 228]. Thus, we can say that collectively these modern technologies can bring solutions we have not achieved until now.
I would like to thank Prof. Mingchen Gao, Bhumika Khatwani and other anonymous reviewers for the helpful and constructive discussions and feedback that helped in improving the quality of the chapter.