Section 4 Data Handling
Comparison of Machine Learning Algorithms for Predictive Modeling of Beef Attributes Using Rapid Evaporative Ionization Mass Spectrometry (REIMS) Data
Devin A. Gredell, Amelia R. Schroeder, Keith E. Belk, Corey D. Broeckling, Adam L. Heuberger, Soo-Young Kim, D. Andy King, Steven D. Shackelford, Julia L. Sharp, Tommy L. Wheeler, Dale R. Woerner, and Jessica E. Prenni
CONTENTS
- 12.1 Methods
- 12.2 Results and Discussion
- 12.3 Conclusions
- Data Availability
- References
- Acknowledgments
- Author Contributions
- Additional Information
Ambient mass spectrometry is an analytical approach that enables the ionization of molecules under open-air conditions with no sample preparation and very fast sampling times. Rapid evaporative ionization mass spectrometry (REIMS) is a relatively new type of ambient mass spectrometry that has demonstrated applications in both human health and food science. Here, we present an evaluation of REIMS as a tool to generate molecular-scale information as an objective measure for the assessment of beef quality attributes. Eight different machine-learning algorithms were compared to generate predictive models using REIMS data to classify beef quality attributes based on the United States Department of Agriculture (USDA) quality grade, production background, breed type, and muscle tenderness. The results revealed that the optimal machine-learning algorithm, as assessed by predictive accuracy, was different depending on the classification problem, suggesting that a “one-size-fits-all” approach to developing predictive models from REIMS data is not appropriate. The highest performing models for each classification achieved prediction accuracies between 81.5 and 99%, indicating the potential of the approach to complement current methods for classifying quality attributes in beef.
* This chapter was originally published as an Open Access article in Scientific Reports (2019), 9:5721. doi.org/10.1038/s41598-019-40927-6.
In food science, chemical screening of ingredients and finished products is critical to ensure quality for food producers and consumers. Mass spectrometry (MS) is an important chemical detection platform for food analysis; however, methods typically require lengthy and complex sample preparation steps and analysis times.1 Ambient mass spectrometry is a relatively new approach that enables the ionization of molecules under ambient conditions with no sample preparation and very fast sampling times. Takats et al. reported the first ambient ionization approach, desorption electrospray ionization (DESI), in 2004.2 There are now over 30 reported ambient ionization techniques spanning application areas from pharmaceutical analysis to biological imaging to forensics and explosives detection.3 The application of ambient ionization technology for the analysis of food and in particular for the detection of food fraud was recently reviewed by Black et al.1
Rapid evaporative ionization mass spectrometry (REIMS) is an emerging ambient ionization technique that has demonstrated applications in both human medicine and food science.4 For example, REIMS-based tissue analysis can be used for intraoperative analysis of histological tissue for the identification of cancerous tissue margins and other prognostic and diagnostic applications.5-7 While the REIMS technology was initially developed with biomedical applications in mind, it has also proven to be a valuable tool for the analysis of food. Recently, the utilization of REIMS for the analysis of meat products has generated very promising results across various classification scenarios reflective of important quality attributes such as genetic differences, production background, and sensory attributes. Balog et al. (2016) used REIMS to differentiate between various mammalian meat species and beef breeds with 100% and 97% accuracy, respectively.8 Similarly, Black et al. used REIMS to accurately (98.9%) classify several fish species as an approach for detecting food fraud in the seafood industry.9 Guitton et al. (2018) utilized REIMS to detect lipid changes in porcine muscle tissue reflective of treatment with ractopamine, a common growth-promotant used to increase muscle mass in swine.10 Verplanken et al. (2017) successfully segregated pork carcasses with and without boar taint, an important sensory attribute related to pork quality.11 Importantly, these examples illustrate the potential value of using REIMS to generate molecular-scale information as an objective measure for the assessment of meat quality.
To use molecular profiles generated by REIMS, or by any of the ambient ionization techniques, as a means to classify samples, one must employ machine-learning algorithms to generate a predictive model. Machine learning is the process of rapidly finding and characterizing patterns in complex data.12 There are many different types of machine learning including, for example, decision tree learning, network analysis, linear regression, support vector machines, and similarity functions.12 Each of these algorithms is based on a different mathematical approach; thus, depending on the types of samples, data, and phenotypes in an experiment, one algorithm can be expected to greatly outperform others in terms of prediction accuracy. The majority of REIMS applications have used linear discriminant analysis on data reduced by principal component analysis (PCA-LDA) to generate predictive models. The PCA-LDA method performs well for the classification of groups that tend to show large differences in the molecular profile of samples, such as the REIMS-meat studies described above. However, when molecular profiles of samples are not overly distinct (e.g., consumer food preference) or classification of multiple groups within a single model is desired, alternative machine-learning approaches may outperform PCA-LDA.13
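The PCA-LDA approach described above can be sketched in a few lines. The example below is a minimal illustration and not the workflow used in the REIMS studies cited here: it assumes scikit-learn and NumPy are available, and the feature matrix, class labels, number of components, and fold count are all invented for demonstration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a REIMS feature matrix (assumption, not study data):
# 120 samples x 500 binned intensities, three hypothetical classes.
rng = np.random.default_rng(0)
n_per_class, n_features = 40, 500
labels = np.repeat([0, 1, 2], n_per_class)
spectra = rng.normal(size=(labels.size, n_features))
# Shift the first 20 features by class so the groups differ strongly.
spectra[:, :20] += labels[:, None] * 3.0

# PCA compresses the spectra to a few components; LDA classifies in that space.
# n_components=10 is an arbitrary illustrative choice.
pca_lda = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),
    LinearDiscriminantAnalysis(),
)

# 5-fold cross-validated accuracy; the scaler and PCA are refit on each
# training fold, so no information leaks from the held-out samples.
scores = cross_val_score(pca_lda, spectra, labels, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"PCA-LDA mean cross-validated accuracy: {mean_accuracy:.3f}")
```

Wrapping the reduction and the classifier in one pipeline keeps the dimensionality reduction inside the cross-validation loop, which matters when the number of features far exceeds the number of samples.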
In this study, the predictive accuracy of eight different machine-learning algorithms was compared for the generation of predictive models using REIMS data to classify attributes of beef based on USDA quality grade, production background, breed type, and muscle tenderness.
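Comparing algorithms by predictive accuracy amounts to scoring each candidate on the same held-out samples and ranking the results. The dependency-free sketch below illustrates that idea with k-fold cross-validation. The three toy classifiers, the synthetic two-class data, and all function names are hypothetical; they are not the eight algorithms or the data evaluated in this study.

```python
import random
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def nearest_centroid(train_x, train_y, test_x):
    """Predict the class whose mean training vector is closest."""
    counts = Counter(train_y)
    sums = {}
    for x, y in zip(train_x, train_y):
        total = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            total[j] += value
    centroids = {y: [s / counts[y] for s in total] for y, total in sums.items()}
    return [min(centroids, key=lambda y: squared_distance(x, centroids[y]))
            for x in test_x]


def one_nearest_neighbor(train_x, train_y, test_x):
    """Predict the label of the single closest training sample."""
    predictions = []
    for x in test_x:
        nearest = min(range(len(train_x)),
                      key=lambda i: squared_distance(x, train_x[i]))
        predictions.append(train_y[nearest])
    return predictions


def majority_class(train_x, train_y, test_x):
    """Baseline: always predict the most common training label."""
    top_label = Counter(train_y).most_common(1)[0][0]
    return [top_label] * len(test_x)


def cross_validated_accuracy(classifier, xs, ys, k=5, seed=0):
    """Fraction of samples predicted correctly across k held-out folds."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        predictions = classifier([xs[i] for i in train],
                                 [ys[i] for i in train],
                                 [xs[i] for i in fold])
        correct += sum(p == ys[i] for p, i in zip(predictions, fold))
    return correct / len(xs)


# Synthetic two-class data (assumption): 30 samples per class, 5 features,
# with class 1 shifted by 2.0 in every feature.
rng = random.Random(1)
features, labels = [], []
for label in (0, 1):
    for _ in range(30):
        features.append([rng.gauss(2.0 * label, 1.0) for _ in range(5)])
        labels.append(label)

candidates = {
    "nearest_centroid": nearest_centroid,
    "one_nearest_neighbor": one_nearest_neighbor,
    "majority_class": majority_class,
}
# Every candidate is scored on identical folds (same seed) for a fair comparison.
accuracies = {name: cross_validated_accuracy(fn, features, labels)
              for name, fn in candidates.items()}
best_model = max(accuracies, key=accuracies.get)
for name, accuracy in accuracies.items():
    print(f"{name}: {accuracy:.3f}")
print(f"best: {best_model}")
```

Fixing the fold assignment across candidates means that differences in accuracy reflect the algorithms rather than the sample split, and the majority-class baseline gives a floor against which any real model should be judged.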