Big Data Management Systems in Healthcare

Because of the large volume of data that health systems have to parse and explore, they will have to rely on distributed frameworks to divide and analyse it. Open-source projects like Hadoop and Apache Spark can be run on the cloud and provide a variety of analytics that healthcare systems can use [8, 9]. These open-source projects empower health systems to analyse large data sets that would otherwise be impractical to analyse. The data also has to be aggregated in a location where it can be analysed effectively and efficiently. NoSQL databases such as CouchDB and MongoDB offer a place to pool the raw data, from which it can be loaded and analysed by Hadoop, which can handle huge amounts of data with a variety of structures or no structure at all. Multiple vendors, such as Amazon Web Services (AWS), Cloudera, and MapR Technologies, distribute Hadoop platforms [10, 11]. The alternative, Apache Spark, supports SQL queries and processes data in memory, which makes it much faster than Hadoop for many workloads but comes at a cost for large data sets, which need correspondingly large amounts of memory. Apache Spark can be used for the real-time data solutions that healthcare systems require [12].

Healthcare data also includes signals such as electrocardiograms, as well as images and video, all stored in a patient’s electronic health record. The combination of images, videos, and structured text stored in a patient’s health record can be harnessed using artificial intelligence (AI). AI programs can draw actionable insights from this wealth of structured and unstructured data to support informed decisions and disease diagnosis. Healthcare professionals can then review the abnormalities flagged by these machine learning approaches.

Image analytics is important in the healthcare context, where the abundance of procedures, including CT, MRI, X-ray, molecular imaging, ultrasound, PET, EEG, and mammography, produces a huge volume of large imaging files. Radiologists and doctors do an excellent job of manually analysing images and finding abnormalities. However, many rare and undiscovered diseases can make diagnosis a challenge. To help in such situations, machine learning can be used to recognize disease patterns from the large data sets amassed over the years. There are also pre-built classification libraries that have already been trained on millions of labelled images. These can assist doctors and healthcare professionals in diagnosing patients correctly without the health system having to train its own models.

Healthcare systems can also collect data from large technology companies such as Google and Apple. With the growing popularity of wearable devices, patients want to collect as much data as possible about their health. These wearable devices can have additional attachments for diabetes or cancer patients to record further data that supports their health and wellness. Google and Apple provide developer kits that allow health systems to tap into these data stores and keep doctors connected with their patients. The combination of data from wearable devices and existing patient health records can provide additional insight and personalized healthcare solutions for a patient. With wearable sensors, doctors can track and better manage patient conditions and offer individualized treatment.
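As a rough illustration of combining wearable data with existing patient records, the sketch below averages readings per patient and attaches them to the matching record. All names, identifiers, and values here are hypothetical and for illustration only; a real integration would go through the vendors' developer kits and the health system's own record store.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical patient records keyed by patient ID (illustrative data only).
patient_records = {
    "p001": {"name": "Patient A", "condition": "type 2 diabetes"},
    "p002": {"name": "Patient B", "condition": "hypertension"},
}

# Hypothetical wearable readings as (patient_id, metric, value) tuples.
wearable_readings = [
    ("p001", "glucose_mg_dl", 110),
    ("p001", "glucose_mg_dl", 150),
    ("p002", "heart_rate_bpm", 72),
    ("p002", "heart_rate_bpm", 88),
    ("p003", "heart_rate_bpm", 65),  # no matching record, so it is ignored
]


def summarize_wearables(readings):
    """Average every metric for every patient seen in the readings."""
    grouped = defaultdict(lambda: defaultdict(list))
    for patient_id, metric, value in readings:
        grouped[patient_id][metric].append(value)
    return {
        pid: {metric: mean(values) for metric, values in metrics.items()}
        for pid, metrics in grouped.items()
    }


def merge_with_records(records, readings):
    """Attach wearable averages to each known patient record."""
    summary = summarize_wearables(readings)
    return {
        pid: {**record, "wearable_averages": summary.get(pid, {})}
        for pid, record in records.items()
    }


combined = merge_with_records(patient_records, wearable_readings)
```

Readings for patients without a record are dropped here for simplicity; a production system would more likely queue them for identity matching.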

Health systems can also purchase a pre-built commercial platform that is ready to use and more user-friendly than an open-source, custom-built solution. A powerful and well-known platform is IBM Watson, commercial software that is used for exchanging and investigating data between hospitals, providers, and researchers. This commercial platform has the ability to extract maximum information from minimal input. Healthcare systems can easily set up these commercial out-of-the-box solutions and receive continuous support from the vendor to troubleshoot any issues. These platforms are also validated and regulated for commercial release. Healthcare systems will have to pay a premium if they choose this route but will gain the benefits of a working analytics product.

The purpose of healthcare systems when adopting big data solutions is to make patients’ lives easier and healthcare more efficient. To achieve this, the health system has two options: it can purchase a pre-built commercial solution where data can be stored and analysed to yield actionable insights for the benefit of the healthcare system and the monitoring of patients’ health, or it can start from scratch and create a custom-built solution that can be built and improved upon for years to come.

The custom-built solution will have to solve the volume, velocity, variety, and veracity problems the health system is facing. Starting with volume, the healthcare system will have to look into a storage solution. Many organizations prefer to keep data storage on premises to retain control over the data and maximize uptime. However, on-site server networks can be costly to scale and maintain over time, especially as data grows at an exponential rate. Lower cost and improved reliability make cloud-based storage an appealing alternative. Ideally, healthcare companies should have a hybrid solution, which is the most flexible and usable approach for multiple data access and storage needs.

Once the storage systems are up and running, the foundational layer of data analytics is laid down. The next step is compiling the variety of data sources to be analysed. Healthcare data can come in many forms, and there are multiple ways to collect and store it, such as NoSQL databases like MongoDB or CouchDB. Healthcare systems can also deploy application programming interfaces (APIs) to transmit data, especially when data is housed in different locations depending on where testing is completed. Once the data is compiled, a choice of big data platforms and tools is available to begin performing big data analytics. Popular choices are Hadoop and its MapReduce programming model, which parse through large amounts of data and deliver queries and reports [13].
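To make the MapReduce idea concrete, the following is a minimal single-machine sketch of its three phases (map, shuffle, reduce) in plain Python, counting diagnoses across encounter records. It does not use Hadoop itself; the record layout and function names are assumptions chosen for illustration, and a real job would distribute these phases across a cluster.

```python
from collections import defaultdict

# Hypothetical encounter records (illustrative data only).
encounters = [
    {"patient_id": "p001", "diagnosis": "diabetes"},
    {"patient_id": "p002", "diagnosis": "hypertension"},
    {"patient_id": "p003", "diagnosis": "diabetes"},
    {"patient_id": "p004", "diagnosis": "asthma"},
]


def map_phase(record):
    """Emit one (diagnosis, 1) pair per encounter record."""
    yield (record["diagnosis"], 1)


def shuffle_phase(pairs):
    """Group values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


diagnosis_counts = run_job(encounters)
```

Because each map call looks at one record and each reduce call at one key, both phases can run in parallel on separate machines, which is what lets the model scale to large data sets.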

Big data algorithms can be applied in real time for data pre-processing purposes. Healthcare systems should be able to develop a data pipeline that stores the large volumes of data collected from imaging, such as MRI and CT scans, in a format suitable for big data platforms such as Spark. This data conversion process saves time on downstream data cleaning and allows health systems to analyse large amounts of imaging data more quickly. Other machine learning algorithms can be applied at this stage to parse through unstructured data and find patterns and anomalies in patients’ health data. These algorithms can be trained and deployed to prevent disastrous disease outcomes for patients and to help doctors deliver better care.
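As one very simple stand-in for the anomaly-finding step, the sketch below flags readings that lie far from the mean of a series using a z-score. This is a basic statistical technique rather than a trained machine learning model, and the threshold and sample heart-rate values are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold.

    The threshold of 2.0 is an illustrative assumption, not a clinical rule.
    """
    if len(values) < 2:
        return []
    centre = mean(values)
    spread = pstdev(values)  # population standard deviation of the series
    if spread == 0:
        return []  # all readings identical, nothing stands out
    return [
        (index, value)
        for index, value in enumerate(values)
        if abs(value - centre) / spread > threshold
    ]


# Hypothetical resting heart-rate readings in beats per minute.
heart_rates = [72, 75, 70, 74, 73, 71, 140, 72]
flagged = find_anomalies(heart_rates)
```

A single extreme reading also inflates the standard deviation, so for noisier data a median-based measure or a model trained on each patient's history would usually be a better fit.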

Using a custom-built solution for healthcare and deploying multiple open-source products come with a security risk. Healthcare systems will have to invest in defences against malicious attacks intended to steal patients’ data. Security systems will have to be fortified and follow the HIPAA Security Rule to store, transmit, and authenticate data and to control access for specific individuals. Common security measures, such as up-to-date antivirus software, firewalls, encryption of sensitive data, and multifactor authentication, will have to be employed. Sound architectural solutions are recommended, especially when handling patients’ sensitive data, such as social security numbers, credit card information from billing, and test data. Such data will have to be encrypted, and access to it restricted to a limited number of employees. Machine learning algorithms can also be deployed to detect network anomalies and alert hospital systems if the network is facing any type of cyber security attack, with countermeasures deployed to prevent any access to patients’ data.
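Two of the ideas above, restricting access to specific individuals and keeping raw identifiers out of analytics data, can be sketched with the Python standard library alone. The roles, field names, and key below are hypothetical. Note also that the keyed hash shown is pseudonymization, not encryption: it cannot be reversed, and encrypting stored data would call for a vetted cryptography library and managed keys.

```python
import hashlib
import hmac

# Hypothetical policy mapping each role to the fields it may read.
ROLE_PERMISSIONS = {
    "physician": {"name", "test_results"},
    "billing": {"name", "card_last4"},
}


def pseudonymize(value: str, key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (HMAC).

    The same value and key always give the same token, so records can
    still be linked without exposing the raw identifier.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def view_record(record: dict, role: str) -> dict:
    """Return only the fields the given role is permitted to see."""
    allowed = ROLE_PERMISSIONS.get(role, set())
    return {field: value for field, value in record.items() if field in allowed}


# Illustrative key only; a real key would come from a key management system.
demo_key = b"demo-key-not-for-production"

record = {
    "name": "Patient A",
    "ssn": "000-00-0000",  # placeholder value, not a real number
    "card_last4": "4242",
    "test_results": "HbA1c 6.1%",
}

billing_view = view_record(record, "billing")
unknown_view = view_record(record, "visitor")  # unknown roles see nothing
ssn_token = pseudonymize(record["ssn"], demo_key)
```

Denying by default, so that an unrecognized role receives an empty view, keeps a misconfigured role from accidentally exposing sensitive fields.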
