IoT in Bioinformatics and Biotechnology
With the large volume of data available today, it is difficult to find relevant content. The IoT is a world of physical objects embedded with sensors and actuators, connected by wireless networks and communicating over the Internet. Together they form a network of intelligent objects that have processing capacity, capture environmental variables, and react to external stimuli. These objects can be controlled over the Internet, enabling a multitude of new applications.
The number of devices connected to the IoT increases every day. Technology is entering our daily lives at a growing pace, bringing flexibility and mobility to the most diverse tasks: security, home automation, industry 4.0, hospital automation, and medical services, among others. This is the concept of IoT in medicine, where all computerized devices can interconnect over the Internet.
Although the Web has been used extensively in the last decade, new technological standards such as web services have emerged. Composing these web services implies a new form of data management, involving data in XML format, new management techniques, transactions, and XML query processing. From the point of view of information systems, the falling cost of computer systems, combined with software models for sharing distributed memory, facilitates the storage and manipulation of medical data in hospitals and clinics with parallel processing. The resulting environments integrate varied computational resources that are managed by different organizations and geographically distributed, which creates a new need in the management of this data.
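As an illustration of XML query processing over medical data, the following minimal Python sketch filters sensor readings from an XML payload using the standard library's xml.etree.ElementTree module. The document layout (patients, patient, reading) and the function name readings_for_sensor are assumptions made for this example, not a format prescribed by the text or by any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payload, as a web service might return it. The element
# and attribute names are illustrative assumptions, not a real standard.
SAMPLE_XML = """
<patients>
  <patient id="p1">
    <reading sensor="heart_rate" unit="bpm">72</reading>
    <reading sensor="temperature" unit="C">36.8</reading>
  </patient>
  <patient id="p2">
    <reading sensor="heart_rate" unit="bpm">118</reading>
  </patient>
</patients>
"""


def readings_for_sensor(xml_text, sensor):
    """Return (patient_id, value) pairs for one sensor type."""
    root = ET.fromstring(xml_text)
    results = []
    for patient in root.findall("patient"):
        # ElementTree supports a limited XPath subset, including
        # attribute predicates such as [@sensor='heart_rate'].
        for reading in patient.findall(f"reading[@sensor='{sensor}']"):
            results.append((patient.get("id"), float(reading.text)))
    return results


if __name__ == "__main__":
    print(readings_for_sensor(SAMPLE_XML, "heart_rate"))
```

A production system would more likely rely on a dedicated XML database or a full XPath/XQuery engine; the sketch only shows the kind of query involved.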
IoT is one of the main enabling technologies for cyber-physical systems and for realizing the intelligent vision of this scenario. Several recent technological advances have allowed the emergence of IoT applied to bioinformatics, such as nanotechnology, wireless sensor networks, mobile communication, and ubiquitous computing. However, there are still technical challenges to be overcome to fully realize the IoT paradigm.
In this sense, bioinformatics applications typically manage a large volume of data that is multidimensional, dynamic, and of varying complexity, and that comes from several heterogeneous sources such as sensor data, protein and gene sequencing, and image digitization, among many others. The technological challenges concern the design of solutions, such as middleware platforms, that deal with the enormous heterogeneity resulting from the diversity of hardware, sensors and actuators, and wireless technologies inherent to IoT bioinformatics.
The development of IoT applications in health must consider the scale and heterogeneity of medical and biomedical devices in the healthcare environment. The huge amount of data generated, often as streams that require online computing, must be processed and stored, and the resources needed to handle it, which are generally heterogeneous, must be managed so that bioinformatics applications receive value-added answers and information on time. Such characteristics demand advanced technology for adequate management and satisfactory performance in manipulating the stored data and in extracting and managing knowledge from it [3,20-22].
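To make the idea of online computing over a stream concrete, here is a minimal Python sketch that maintains the running mean and variance of sensor readings one value at a time, without storing the stream. It uses Welford's online algorithm, which is chosen here as an illustration and is not a method named in the text; the class name RunningStats and the sample values are hypothetical.

```python
class RunningStats:
    """Running mean and sample variance via Welford's online algorithm."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, value):
        """Fold one new reading into the statistics in O(1) time and memory."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        """Sample variance; defined as 0.0 until two readings have arrived."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


if __name__ == "__main__":
    stats = RunningStats()
    # Hypothetical heart-rate stream; in practice values arrive from a device.
    for reading in [72.0, 75.0, 71.0, 118.0, 74.0]:
        stats.update(reading)
    print(stats.count, stats.mean, stats.variance)
```

Because each update touches only three numbers, the same object can follow a stream of arbitrary length, which is the property that stream processing in this setting requires.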
Bioinformatics is one of the results of the work of Watson and Crick, who revealed in 1953 that DNA is structured as a double helix. The researchers could not have imagined the volume of information that would be generated, exponentially, from that moment on. In the following decades, computational tools, initially quite simple, began to be developed. They made it possible to analyze and resolve numerous questions about the structure of DNA; the genetic information that encodes proteins, their structural properties, and the factors that regulate them; and the events associated with genetic regulation, the molecular bases of embryonic development, and the evolution of metabolic and biochemical pathways. Simply put, bioinformatics is the union of computing techniques with molecular biology, and so it is an area that needs professionals with multidisciplinary backgrounds [23-25].
Since the 1960s, the growth in the number of known amino-acid sequences has led to the pioneering application of computers in molecular biology. From that period on, the amount of data to be analyzed has grown considerably, and more affordable computers have made it possible to introduce them into academic settings [23-25].
Margaret Dayhoff developed the first programs to determine the amino-acid sequence of a protein in 1965 and prepared the first protein sequence database, which evolved into the PIR (Protein Information Resource) in 1983. Sequence comparison and phylogenetic analysis were the first advances in the field of bioinformatics in the 1960s. Later, in the 1970s, structural analyses of macromolecules began. However, these analyses were quite limited by the computing capacity available at that time. In that same period, computational methods also began to be applied to the processing of information about nucleic acids, and programs to compare sequences began to be developed. FASTA was developed around 1985, GenBank in the early 1980s, and SwissProt around 1987. In the late 1980s, the term bioinformatics started to be used for the science that integrated information technology and biology [26,27].
Still in the late 1980s, more advanced bioinformatics programs were developed in academic centers and quickly became commercial products, distributed as integrated tool packages for managing molecular biology data. Improvements in computer systems allowed a great advance in machine learning techniques with clear applicability in the field of bioinformatics. In the late 1990s, the demand for bioinformatics specialists was remarkable; however, few universities offered educational programs on this topic, although their number has grown considerably in recent years [23,24].
In 1999, Brazilian science stood out internationally with the complete DNA sequencing of the bacterium Xylella fastidiosa. This work relied on Internet-based genetic sequencing software and marked the beginning of bioinformatics in Brazil. Currently, several sequencing projects are underway in the country, such as brGene, OMM, PIGS, Leifsonia xyli, the genomes of coffee and banana, and RioGene, among others [23,24].
In 2002, the first specialization course in bioinformatics in Brazil was implemented by the LNCC (National Laboratory for Scientific Computing), and in that same year two stricto sensu postgraduate courses in bioinformatics were also authorized, one at USP and the other at UFMG, both currently in full operation. All of the above shows that bioinformatics is growing considerably and will be increasingly necessary for interpreting data in molecular biology. Sophisticated molecular techniques such as microarrays and next-generation sequencing, among others, confirm the decisive role of bioinformatics in understanding the billions of data points generated by these innovative tools [23,24].
Worldwide, bioinformatics has acquired increasing importance in the manipulation of biological data. Through a combination of procedures and techniques, it helps biologists deal with the complexity of both hardware and software tools, optimizing workflows in a distributed environment and reducing the overhead of moving data between programs. Bioinformatics is thus a convergence of current technologies that are at different stages of development and whose full possibilities are still unknown. There is a marked convergence of technological, physical, biological, social, cultural, and environmental domains, and a transition from the digital revolution to a new industrial revolution driven by the ever more pronounced influence of computing. Understanding the importance of biotechnology and bioinformatics in this industrial evolution points to an improvement in people's quality of life [3,28].
Bioinformatics is a body of knowledge applied to the field of biology. When biology started to generate a large volume of sequencing data, for example, it became necessary to recruit computer scientists, statisticians, and mathematicians to develop software and tools to assist in the analyses. Bioinformatics seeks to integrate scientific knowledge with computational algorithms to generate specific knowledge, contributing to new medical practices, facilitating diagnosis, and indicating the best way to treat the individual patient [28,29].
Today there is research that generates data that cannot be analyzed without a computer: Big Data analysis aims to reveal everything at once, not just one or two parts of a cell. Big Data is the term for the huge amount of structured and unstructured data generated every second. The keyword of bioinformatics is data integration, that is, developing methods to organize and mine these vast amounts of data. A great deal of biological, biochemical, and biophysical data is now produced in research, so the central idea is to integrate all this heterogeneous information and transform it into a result that can be interpreted and understood. The objective is to study and relate all this data to advance research [30,31].
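The data integration described above can be sketched, under simplifying assumptions, as merging records from several sources into one view per sample. In the Python example below, the function name integrate, the shared sample_id key, and the source and field names are all hypothetical, introduced only for illustration; real integration also has to reconcile identifiers, units, and data quality, which this sketch leaves out.

```python
from collections import defaultdict


def integrate(*sources):
    """Merge (source_name, records) pairs into one dict per sample_id.

    Each field is prefixed with its source name so that values coming
    from different origins cannot overwrite one another.
    """
    merged = defaultdict(dict)
    for source_name, records in sources:
        for record in records:
            sample_id = record["sample_id"]
            for key, value in record.items():
                if key != "sample_id":
                    merged[sample_id][f"{source_name}.{key}"] = value
    return dict(merged)


if __name__ == "__main__":
    # Hypothetical records from two heterogeneous sources.
    sequencing = [{"sample_id": "s1", "gene": "TP53", "variant": "R175H"}]
    sensors = [
        {"sample_id": "s1", "heart_rate": 72},
        {"sample_id": "s2", "heart_rate": 90},
    ]
    for sample_id, fields in integrate(
        ("sequencing", sequencing), ("sensors", sensors)
    ).items():
        print(sample_id, fields)
```

Prefixing each field with its source keeps the provenance of every value visible, which helps when the integrated result has to be interpreted later.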
In the early 1970s, traditional biotechnology was a discipline confined to chemical engineering departments and microbiology programs. The term "biotechnology" itself is much older: it was coined in 1919 by the Hungarian engineer Karl Ereky, who used it to describe his experiments on the large-scale production of pigs fed with beets grown with microorganisms. Ereky then defined biotechnology as all lines of work in which products are generated from raw materials with the aid of living organisms. However, the terminology remained quite ambiguous among scientists, and only in 1961 was the term associated with the study of the industrial production of goods and services by procedures using biological organisms, systems, or processes. This is due to the Swedish microbiologist Carl Göran Hedén, who suggested changing the title of a journal in the field of applied microbiology and industrial fermentation from Journal of Biochemical and Microbiological Technology and Engineering to Biotechnology and Bioengineering.
Traditional biotechnology focuses on three main aspects: (i) preparation of the raw material to be used as a source for microorganisms; (ii) fermentation of the material in bioreactors, yielding the biotransformation and production of the desired material; and (iii) purification of the final product. Obtaining a particular product on an industrial scale is the main objective of biotechnology, so a great deal of research was done to improve these three aspects. To this end, investments were made in the design of new bioreactors and in the control and monitoring of fermentation processes. Despite a significant increase in production, the optimization of the biotransformation process remained below the desired level.
The strains of microorganisms capable of synthesizing the products of interest did so at levels that were suboptimal for an industrial scale. Random mutations induced by chemical mutagens and ultraviolet radiation were sometimes able to increase production levels. However, the scope of this approach is often limited: mutation induction usually affects not only the desired trait but also others that are important for cellular metabolism.
Bioinformatics is an area of biotechnology that applies computational techniques to understand biological behavior in complex samples. It supports advanced studies in cancer and chronic diseases using the tools available in computing, IoT devices, and new disruptive technologies such as artificial intelligence and machine learning, in short, cognitive computing. The result is more complex, transdisciplinary, and multifunctional research. It ranges from understanding gene alterations and large-scale protein variations, which cannot be analyzed individually by manual techniques, to the biological variability among patients diagnosed with the same disease, which can be measured and clarified with computer models. This contributes to the development of targeted therapies and improves the prognosis of diseases that are difficult to diagnose and treat.
Bioinformatics is, therefore, an area of knowledge applicable in several institutions, such as clinical analysis laboratories; biotechnology, biochemistry, and pharmaceutical companies; and hospitals. It is an area of knowledge and not simply a platform for technological solutions, and its importance grows along with the increase in large-scale data generation. Its applications include medicine, agronomy, and environmental sciences; the development of new techniques and software for biological problems; and network-based mathematical modeling. The development of programs that make it possible to study changes in molecular structures aims to produce more effective and more cost-effective drugs, and also finds multiple applications in biology and medicine [28,29].
These programs consolidate several areas of scientific knowledge, such as genetics, molecular biology, cell biology, microbiology, biochemical engineering, biochemistry, bioinformatics, biosafety, and bioethics, among other advanced segments. The field has a multidisciplinary profile, in line with the guidelines of translational medicine: it combines knowledge from the biosciences, informatics, and the exact sciences for the management, analysis, and even prognosis of medical and biological data, that is, elements or measurements collected from biological sources such as DNA, RNA, proteins and enzymes, digital images, and other data.
Multidisciplinarity makes it faster to obtain practical results, whether for clinical or surgical treatments or for technology development in the medical field. Likewise, the structure needed to apply translational medicine must support this new concept of faster transfer of information from the laboratory to clinical practice. Companies must be aligned with academic research, which in turn can move from being merely theoretical to being applied. This is a two-way street, in which the company understands what society wants and the academy studies new technologies, driven by the need to bring interdisciplinary areas together to develop enabling solutions for the new industry based on engineering and its methods in modern health.