Preface

It gives us immense pleasure to introduce the first edition of the book entitled Cognitive Computing Systems: Applications and Technological Advancements. Cognitive computing systems are one of the emerging areas under the umbrella of artificial intelligence. They mimic the working and learning mechanisms of the human brain. The major challenge for cognitive systems is to devise models that are self-learning and can infer useful information from extensive, ambiguous data. Many techniques are pivotal to building a successful cognitive system, including natural language processing, speech recognition, neural networks, deep learning, and sentiment analysis. The objective of this book is to introduce cognitive computing systems, bring out their key applications, discuss the different areas and technologies used in cognitive systems, and study existing works, underlying models, and architectures. Some unique features of this book that make it a useful resource for learning are the following:

  • Simple and articulate language
  • Support from extensive examples
  • Inclusion of recent and up-to-date work
  • Discussion of open issues
  • Presentation of various scientific works on real-world applications

Each chapter presents the use of cognitive computing and machine learning in a particular application area. This book will be an asset for researchers and faculty members who want to understand cognitive computing in fine detail. The work in this book has been contributed by well-qualified and eminent faculty members. The authors are keen researchers in artificial intelligence, cognitive computing, and allied areas.

All editors of this book are experts in this domain, each with excellent publications and years of research experience. A brief profile of the editors follows. Dr. Vishal Jain is an Associate Professor at the Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, Greater Noida, UP, India. Dr. Akash Tayal is an Associate Professor with the Department of Electronics and Communication Engineering, Indira Gandhi Delhi Technical University for Women, Delhi, India. Dr. Jaspreet Singh is an Associate Professor with the Department of Computer Science and Engineering, School of Engineering, G. D. Goenka University, Gurgaon, India. Dr. Arun Solanki is an Assistant Professor with Gautam Buddha University, Greater Noida, India.

The idea behind this book is to simplify the journey of aspiring data scientists and machine learning enthusiasts across the world. Through this book, readers will be able to work on machine learning problems and gain from experience. This book provides a high-level understanding of various machine learning algorithms along with tools and techniques.

This book would not have been possible without the motivation and valuable contributions of many people. First of all, we would like to thank the almighty God for providing the inspiration and the motivation to keep doing good work. We are grateful to our mentors, who are our inspirational models. This book would also not have been possible without the support of our families, including the kids, who cooperated in many ways, not least by sparing us precious time from their already tight schedules, time that would otherwise have been spent on their upbringing, so that we could complete this book.

We hope that the readers make the most of this volume and enjoy reading this book. Suggestions and feedback are always welcome.

PART I

Using Assistive Learning to Solve Computationally Intense Problems

High-Frequency Stochastic Data Analysis Using a Machine Learning Framework: A Comparative Study

LOKESH KUMAR SHRIVASTAV¹ and RAVINDER KUMAR²

¹University School of Information, Communication and Technology, Guru Gobind Singh Indraprastha University, New Delhi, Delhi 110078, India

²Skill Faculty of Engineering and Technology, Shri Vishwakarma Skill University, Gurugram, Haryana 122003, India


ABSTRACT

High-frequency stochastic data analysis and prediction are challenging and exciting problems if we aim to maintain a high level of accuracy. The stock market dataset is selected randomly for the experimental investigation of the study. Historical datasets of a few stock markets have been collected and used for this purpose. The model is trained, and the results are compared with the real data. In the past few years, specialists have tried to build computationally efficient techniques and algorithms that predict and capture the nature of the stock market accurately. This chapter presents a comparative analysis of the literature on applications of machine learning tools to financial market datasets, together with a brief study of relevant existing tools and techniques used in financial market analysis. The main objective of this chapter is to provide a comparative study of novel and appropriate methods of stock market prediction. A brief explanation of advanced and recent tools and techniques available for the analysis, and a generalized and fundamental model in the R language for stock market analysis and prediction, are also provided. In addition, this chapter reviews the significant current and future challenges of the field.

INTRODUCTION

In recent years, high-speed data acquisition technology has created demand for appropriate and advanced analysis and prediction mechanisms. With easy access to and advancement of storage systems, a massive amount of data is generated every second, which in turn has driven demand for high-speed data analytics systems. With the development of information and communication technology and complex computational algorithms, the collection, analysis, and prediction of high-frequency data have become possible [1]. In recent years, machine learning frameworks have played an important role in providing good forecasts and fast computation on massive amounts of high-frequency data. High-frequency data are datasets collected at fine regular or irregular intervals of time; they are referred to as high-frequency stochastic time-series datasets [2].

Problem-solving technology may be classified into two parts: hard computational methods and soft computational methods. Hard computational methods rely on precise models to reach a solution accurately and quickly. Prof. L. Zadeh introduced the soft computational method in 1994. It is a mechanism that deals with uncertainty and aims at robustness and better tolerance in the model. Soft computing is a hybrid mechanism that combines fuzzy logic, evolutionary computation, probabilistic reasoning, and machine learning. Machine learning, a fundamental constituent of soft computing, was introduced by A. Samuel in 1959; it concerns the study and development of technologies and tools that can learn and adapt from data and predict its future behavior. The application of machine learning has grown gradually and has now reached a very advanced level, where it can examine and forecast the future behavior of a dataset. In the past few decades, machine learning has produced compelling results on low-frequency datasets. A low-frequency dataset is one collected at intervals of an hour or a day. Nowadays, we are living in the era of nanotechnology, where data can be produced at fine intervals of minutes, seconds, or even nanoseconds. Such datasets are beneficial for machine learning tools and techniques, because they increase the opportunity for deep learning and can improve prediction capability. Machine learning techniques are applied to high-frequency data in fields ranging from social networks to medical science and from nanoscience to rocket science. The existing frameworks for the analysis and prediction of these high-frequency datasets can be classified into the following two categories.

  • Statistical model: This is the traditional mathematical approach, in which advanced statistical models and procedures are used to analyze the high-frequency dataset. It predicts stock market behavior well, but it relies on assumptions that determine the accuracy of the model [3]. Therefore, it cannot be used as an intelligent system.
  • Soft-computing-based model: Machine learning frameworks are compelling and are capable of capturing the dynamics of stock prices and predicting future trends by using the time-series mechanism.
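
The distinction drawn above between high-frequency data (for example, one observation per minute) and low-frequency data (one observation per hour or day) can be illustrated with a minimal sketch. The code below is written in Python rather than the R used later in the chapter, relies only on the standard library, and uses a made-up linear price series; the function name `to_hourly_bars` and all values are hypothetical and are not taken from the chapter's datasets.

```python
from collections import OrderedDict
from datetime import datetime, timedelta


def to_hourly_bars(ticks):
    """Aggregate (timestamp, price) observations into hourly open/high/low/close bars."""
    bars = OrderedDict()
    for ts, price in sorted(ticks):
        # Truncate the timestamp to the start of its hour to get the bar key.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        if hour not in bars:
            bars[hour] = {"open": price, "high": price, "low": price, "close": price}
        else:
            bar = bars[hour]
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # the latest observation seen so far in this hour
    return bars


# Hypothetical one-minute prices covering two hours (120 observations).
start = datetime(2020, 1, 1, 9, 0)
minute_ticks = [(start + timedelta(minutes=i), 100.0 + 0.1 * i) for i in range(120)]

hourly = to_hourly_bars(minute_ticks)
print(len(minute_ticks), "minute observations ->", len(hourly), "hourly bars")
```

Each hourly bar keeps only four numbers from sixty observations, which is one way to see why minute-level data gives a learning algorithm far more material to work with than hourly or daily data.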

Recent research has shown that both kinds of model have their own significance in capturing high-frequency datasets [4,5]. Analysis of stock price trends has been challenging for both investors and researchers. Financial time series are a significant source of information for stock market prediction, and finding hidden patterns is a prerequisite for analyzing and predicting price fluctuations. The fast development of computing capacity, together with modern and smart machine learning frameworks, makes it possible to produce new solutions for data analysis. Manual and fundamental statistical methods are slow and complicated, so their predictions are of limited use: a model has no value if the dataset streams in faster than the model produces its result, and a process that takes too long to arrive at a forecast has no value if the objective is to predict in time. In this work, we will try to produce a generalized machine learning model to build an intelligent system that is capable of predicting the future behavior of the stock market.

The last decade was the era of the single model, where one model was taken for the analysis and prediction. The experimental results showed that these models are good, but to improve on them, hybrid models were applied and analyzed. It was found that a hybrid framework performs better than a single framework. However, there is still much room to produce better models [6]. In this study, we have taken stock market datasets collected at a fine interval of a single minute, so these are high-frequency and nonlinear datasets. High-frequency datasets are rarely available in the field of stock markets, and they were therefore rarely used in the recently reviewed experimental setups. In this study, a basic, generalized, and very well-known statistical model, the autoregressive integrated moving average (ARIMA) model, will be applied to this high-frequency stock market dataset at the time of the experiment, and it will be compared with a fundamental machine learning model, the generalized linear model (GLM), to characterize these two models very carefully. The GLM is a basic regression model of machine learning that supports multiple response distributions. If the ARIMA model is able to compete with the GLM, then it will be compared with other advanced machine learning models at a later stage of this study. The ARIMA model is a fundamental and open model that gives its developer the liberty to understand the nature of the dataset at each stage of processing. This is the reason why it is called a base model. This model carries its own simplicity to understand the nature of nonlinear datasets and saturates it as desired. This approach can help guide students and investors in building a more reliable and optimized intelligent financial forecasting system.

The primary use of this work is to explore and present the fundamental obstacles, future dimensions, and guidelines for this particular line of research. The simulated results have shown deviation from the actual results. Therefore, a deeper comparative ensemble and more advanced analysis and simulation are required to build a more optimized intelligent system that predicts stock market behavior more precisely and correctly. A machine learning framework is therefore required for a more optimized result. For this purpose, we will compare this result with the artificial neural network (ANN) model, a twin support vector machine (TWSVM), or a more complicated support vector machine (SVM) using different hybrid kernels, to capture the behavior of high-frequency datasets more precisely.
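
The comparison of a statistical baseline with a regression baseline described above can be sketched in a few lines. This is only an illustration under several assumptions and is not the chapter's R model: it is written in Python with the standard library only; the ARIMA model is reduced to an ARIMA(1,1,0)-style fit by conditional least squares on first differences; the GLM is reduced to its simplest case, a Gaussian family with identity link on the previous price, which coincides with ordinary least squares; and the minute-level prices are synthetic. All function and variable names are hypothetical.

```python
import math
import random


def fit_arima_110(series):
    """Estimate phi in d_t = phi * d_{t-1} + e_t, where d_t = y_t - y_{t-1}."""
    d = [b - a for a, b in zip(series, series[1:])]
    num = sum(prev * cur for prev, cur in zip(d, d[1:]))
    den = sum(prev * prev for prev in d[:-1])
    return num / den


def fit_gaussian_glm(series):
    """Fit y_t = a + b * y_{t-1} by least squares (Gaussian GLM, identity link)."""
    x, y = series[:-1], series[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def rmse(errors):
    """Root-mean-square of a list of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Synthetic one-minute prices (an assumption, not real market data):
# a random walk whose increments are autocorrelated.
random.seed(0)
prices, step = [100.0], 0.0
for _ in range(799):
    step = 0.6 * step + random.gauss(0.0, 0.05)
    prices.append(prices[-1] + step)

split = 600
train = prices[:split]
phi = fit_arima_110(train)
a, b = fit_gaussian_glm(train)

# One-step-ahead forecast errors on the held-out minutes.
arima_errors = [
    prices[t] - (prices[t - 1] + phi * (prices[t - 1] - prices[t - 2]))
    for t in range(split, len(prices))
]
glm_errors = [prices[t] - (a + b * prices[t - 1]) for t in range(split, len(prices))]

arima_rmse, glm_rmse = rmse(arima_errors), rmse(glm_errors)
print(f"ARIMA(1,1,0)-style RMSE: {arima_rmse:.4f}")
print(f"Gaussian GLM RMSE:       {glm_rmse:.4f}")
```

The two error figures apply only to this synthetic series, which was deliberately generated with autocorrelated increments; they say nothing about how the two models rank on real stock market data, which is the question the chapter's experiments address.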

The rest of this chapter is organized as follows. Section 1.2 presents a detailed review of recent literature on prediction. Section 1.3 presents the shortcomings of various prediction methods. Section 1.4 presents the proposed prediction method. Section 1.5 presents the experimental setup and detailed results. Section 1.6 presents the conclusion and future directions.

 