# SHORTCOMING OF VARIOUS PREDICTION METHODS

**1.3.1 HYBRID MECHANISM**

The individual model has drawbacks such as local optima, overfitting, and difficulty in selecting parameters, all of which directly reduce prediction accuracy. These problems can be solved by combining two or more models in a single approach to get a more precise and accurate result. Such a hybrid is a combination of parametric and nonparametric approaches. First, the parametric approach is applied to model the trend of price movement. Second, a nonparametric approach is applied to the result of the parametric approach to enhance and improve the final performance. The parametric approach is applied in the case of the binomial tree, the finite-difference method, and the Monte Carlo method, whereas the nonparametric approach is applied in the linear neural network, the multilayer perceptron (MLP), and the SVM. In the next part of our study, some new or advanced machine learning tools (TWSVM) will be applied to get a more precise result.

Al-Hnaity and Abbod [24] suggested a hybrid mechanism to optimize the results. It is a linear combination of the individual models, each with its own weight factor. Following this idea, a new and modified hybrid model is proposed to address the nonlinear characteristics of stock-market time-series patterns. The proposed model is a set of BPNN, SVR, and SVM, where a GA decides the weight of each model. Within the proposed model, any other required model can also be combined, with the GA optimizing the result.
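As a rough illustration of the weighted-combination idea (with made-up validation numbers, and a simple seeded random search standing in for the GA described above), the hybrid forecast might be sketched as:

```python
import random

# Hypothetical predictions from three individual models (e.g., BPNN, SVR, SVM)
# on the same validation window; the values are illustrative, not real data.
actual = [10.0, 10.5, 10.2, 10.8, 11.0]
preds = {
    "bpnn": [9.8, 10.6, 10.1, 10.9, 11.2],
    "svr":  [10.2, 10.3, 10.4, 10.6, 10.9],
    "svm":  [10.1, 10.7, 10.0, 11.0, 11.1],
}

def mse(weights):
    """Mean squared error of the weighted combination of model outputs."""
    total = 0.0
    for i, y in enumerate(actual):
        combined = sum(w * preds[m][i] for m, w in zip(preds, weights))
        total += (combined - y) ** 2
    return total / len(actual)

def search_weights(trials=2000, seed=0):
    """Stand-in for the GA: random search over normalized weight vectors."""
    rng = random.Random(seed)
    best_w, best_err = None, float("inf")
    for _ in range(trials):
        raw = [rng.random() for _ in preds]
        s = sum(raw)
        w = [r / s for r in raw]          # weights sum to 1
        err = mse(w)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err

weights, err = search_weights()
```

A real GA would evolve the weight vectors by selection, crossover, and mutation instead of sampling them independently, but the fitness function (validation error of the combination) is the same.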

**1.3.2 EXISTING FORECASTING MODELS**

Financial time-series forecasting is the main challenge for machine learning techniques. Some commonly known soft computing techniques are discussed in the following [25].

*1.3.2.1 REGRESSION ANALYSIS MODEL*

Regression analysis is a very fundamental and generalized technique that has already proved its efficiency in the field of high-frequency data analysis, where the problem variables move according to time or a time series [26]. Stock market prediction is one example of high-frequency data analysis, where the attributes of the stock market move or fluctuate along the time series. The ARIMA model has already been shown to be an ideal model for setting a benchmark. The hybrid ARIMA model is a better option that accounts for autocorrelated errors. For the comparative analysis with the nonparametric machine learning framework, this study proposes a GLM.

This model contains a collection of regression techniques such as Gaussian, Poisson, Binomial, Gamma, Ordinal, and Negative Binomial regression. All these regression techniques have their own significance, so they can be utilized according to the demands of the problem.
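A minimal sketch of the simplest member of this family, the Gaussian GLM with identity link (equivalent to ordinary least squares), fitted in plain Python to a hypothetical price series:

```python
# Gaussian GLM with identity link (= ordinary least squares) on a trend.
# The price values are illustrative, not real market data.
prices = [100.0, 101.5, 103.2, 104.1, 106.0, 107.3]
t = list(range(len(prices)))            # time index as the single regressor

n = len(prices)
mean_t = sum(t) / n
mean_p = sum(prices) / n
# Closed-form OLS estimates for slope and intercept.
slope = sum((ti - mean_t) * (pi - mean_p) for ti, pi in zip(t, prices)) \
        / sum((ti - mean_t) ** 2 for ti in t)
intercept = mean_p - slope * mean_t

forecast_next = intercept + slope * n   # one-step-ahead trend forecast
```

The other family members (Poisson, Binomial, Gamma, Negative Binomial) differ only in the assumed error distribution and link function; in practice they are fitted by iteratively reweighted least squares rather than this closed form.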

*1.3.2.2 ARTIFICIAL NEURAL NETWORK*

The ANN [27,28] is a fundamental mechanism in the analysis and prediction of high-frequency polynomial datasets due to its capability to adapt to discontinuities and nonlinearities. It is a self-adaptive mechanism in which datasets with nonlinear behavior can be easily analyzed and predicted with the help of a time series. However, experimental results show that it cannot handle nonstationary, multidimensional, and excessively noisy datasets. In these conditions, the FFNN can provide a better result than the ANN due to feedback and past simulations. Still, the ANN gained popularity because of its simplicity; it provides a benchmark for further development and is already used in the area of stock market prediction and analysis. It is a mathematical mechanism, inspired by the biological neural network inside the human brain, that can easily correlate an input stream with an output result. It can handle uncertainty and irregularities in the datasets.
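As a toy illustration of the mechanism (not a production network), a one-hidden-layer network with tanh units can be trained by full-batch gradient descent to adapt to a small nonlinear target:

```python
import math, random

# Minimal one-hidden-layer network (3 tanh units) fitted to y = x^2 on a
# handful of points; purely illustrative of how an ANN adapts nonlinearity.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [x * x for x in xs]                          # nonlinear target

rng = random.Random(42)
H = 3
w1 = [rng.uniform(-0.5, 0.5) for _ in range(H)]   # input -> hidden weights
b1 = [0.0] * H
w2 = [rng.uniform(-0.5, 0.5) for _ in range(H)]   # hidden -> output weights
b2 = 0.0

def forward(x):
    h = [math.tanh(w1[j] * x + b1[j]) for j in range(H)]
    return h, sum(w2[j] * h[j] for j in range(H)) + b2

def loss():
    return sum((forward(x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

initial = loss()
lr = 0.05
for _ in range(1000):
    gw1 = [0.0] * H; gb1 = [0.0] * H; gw2 = [0.0] * H; gb2 = 0.0
    for x, y in zip(xs, ys):
        h, out = forward(x)
        d = 2 * (out - y) / len(xs)               # dL/d(out)
        for j in range(H):
            gw2[j] += d * h[j]
            dh = d * w2[j] * (1 - h[j] ** 2)      # back through tanh
            gw1[j] += dh * x
            gb1[j] += dh
        gb2 += d
    for j in range(H):
        w1[j] -= lr * gw1[j]; b1[j] -= lr * gb1[j]; w2[j] -= lr * gw2[j]
    b2 -= lr * gb2

final = loss()
```

The training error drops as the hidden units bend their tanh responses to fit the curvature, which is exactly the adaptivity the paragraph describes.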

*1.3.2.3 SUPPORT VECTOR MACHINE*

Vladimir Vapnik introduced it at AT&T Bell Laboratories. Experimental results have already shown that it is a far better mechanism than the ANN and the FFNN [29,30]. It is a competent and powerful mechanism that can solve classification and regression problems in terms of both analysis and prediction. It selects certain support points, which define a separating line in a two-dimensional system and a hyperplane in a three-dimensional (or higher) environment. It takes training samples from the classified datasets, maps them into a higher-dimensional space for classification, and separates the classes there using linear techniques. It utilizes a predetermined kernel function chosen according to the requirements of the dataset, from which the optimal hyperplane, defined by the support vectors with the maximum margin, can be produced.
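The prediction side of the mechanism can be sketched as follows. The support vectors, multipliers, and bias below are hand-picked for illustration, not the output of a real QP solver:

```python
def linear_kernel(u, v):
    return sum(a * b for a, b in zip(u, v))

def decision(x, support, labels, alphas, b, kernel=linear_kernel):
    """SVM decision function f(x) = sum_i alpha_i * y_i * K(s_i, x) + b."""
    return sum(a * y * kernel(s, x)
               for s, y, a in zip(support, labels, alphas)) + b

# Hand-picked (not QP-solved) values, purely to show how prediction works:
support = [(1.0, 1.0), (3.0, 3.0)]   # hypothetical support vectors
labels  = [+1, -1]
alphas  = [0.5, 0.5]                 # satisfy sum_i alpha_i * y_i = 0
b       = 2.0

label = 1 if decision((0.5, 0.5), support, labels, alphas, b) > 0 else -1
```

A real SVM would obtain `alphas` and `b` by maximizing the margin subject to the dual constraints; only the support vectors end up with nonzero multipliers, which is why prediction needs just those points.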

*1.3.2.4 KERNEL FUNCTIONS FOR MACHINE LEARNING FRAMEWORK*

A kernel is a mechanism for finding the similarity between two data points. It can be utilized without knowing the feature space of the problem domain because it contains all information about its input datasets as relative positions in feature space. As described by Howley and Madden [31], it is a process that reveals regularities in the dataset through a representation in a new feature space that are not detectable in the original representation. The major merit of the method is that it allows the use of processes based on linear functions in the processed feature space, which is computationally efficient as well as expressive [32]. The kernel trick is a process in which no explicit representation in the new feature space is needed. At polynomial computational cost, it is capable of working in feature spaces whose expressibility is greater than polynomial. This kernel approach can be easily used for solving machine learning problems such as unsupervised classification (clustering), semisupervised learning, and supervised classification or regression. The SVM classifier is the classic and most eminent example of the kernel approach under the supervised learning umbrella. The SVM is the most famous model in which the dataset enters the decision function and the optimization problem only through dot products of pairs of samples. Due to this capability, the SVM handles the problem of nonlinearity. A kernel function *K(x, z)* = ⟨*φ*(x), *φ*(z)⟩, where *φ*(x) is the mapping to the particular feature space, replaces the dot product and is utilized to calculate the dot product of two sampled points in the feature space. The kernel is also utilized to obtain the maximum margin and provide the maximum separating hyperplane in the feature space of the SVM, which creates the optimal decision boundary. Finally, in the SVM, a kernel function is capable of calculating the optimal separating hyperplane without the explicit use of the mapping function in the feature space.
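The kernel trick can be verified numerically for the degree-2 homogeneous polynomial kernel, whose explicit feature map on two-dimensional inputs is known in closed form:

```python
import math

# For K(x, z) = (x . z)^2 on R^2, the explicit feature map is
# phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2), and K(x, z) = phi(x) . phi(z).
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    return dot(x, z) ** 2

def phi(x):
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, z = (1.0, 2.0), (3.0, -1.0)
lhs = poly2_kernel(x, z)       # computed without ever forming phi
rhs = dot(phi(x), phi(z))      # computed in the explicit feature space
```

The two values agree, which is the whole point: the kernel evaluates the feature-space dot product at the cost of an input-space dot product and a square.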

*1.3.2.5 CHOOSING THE RIGHT KERNEL*

The performance of the SVM depends upon the selection of the kernel, which is a very challenging task and depends upon the dataset, the problem, and its nature [33]. Manual selection of the kernel is very complex work, which can be resolved by an automatic kernel selection procedure with prior knowledge of the dataset. Earlier experiments suggest that a mixture kernel can enhance the capability of the single-kernel SVM. Kernels can be mixed in an appropriate way to enhance the performance of the system, for example by the basic kernel-building rule *K(x, z)* = *K*_{1}(x, z) + *K*_{2}(x, z). It is not always possible to construct the right mixture of kernels by hand; therefore, an automatic kernel mixture procedure is required. First, we will try to understand the types of kernels; then, we will understand the fundamental mixture kernel building process.
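One simple automatic criterion from the kernel-selection literature (an illustrative choice here, not the procedure developed later in this chapter) is kernel–target alignment, which scores a candidate Gram matrix against the labels and so lets kernels be ranked before any SVM is trained:

```python
import math

# Kernel-target alignment: A(K, y) = <K, y y^T>_F / (||K||_F * ||y y^T||_F).
# Higher alignment means the kernel's similarity structure matches the labels.
def gram(kernel, xs):
    return [[kernel(a, b) for b in xs] for a in xs]

def alignment(K, y):
    n = len(y)
    num = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    yy_norm = math.sqrt(sum((y[i] * y[j]) ** 2
                            for i in range(n) for j in range(n)))
    return num / (k_norm * yy_norm)

def linear(u, v):
    return sum(a * b for a, b in zip(u, v))

def rbf(u, v):
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)))

xs = [(0.0, 0.0), (0.1, 0.0), (3.0, 3.0), (3.1, 3.0)]   # toy two-cluster data
y = [-1, -1, 1, 1]
scores = {name: alignment(gram(k, xs), y)
          for name, k in [("linear", linear), ("rbf", rbf)]}
best = max(scores, key=scores.get)
```

On this toy two-cluster dataset, the RBF kernel aligns better with the labels than the linear kernel, so it would be selected automatically.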

*1.3.2.6 TYPES OF KERNEL FUNCTIONS*

A list of a few popular kernel functions taken from the existing and recent literature is described as follows.

**1.3.2.6.1 Linear Kernel**

It is the most fundamental and simple kernel function, formed from the inner product ⟨*x*, *y*⟩ plus an optional constant *c*:

*K(x, y)* = *x*^{T}*y* + *c*
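A one-line sketch of this kernel in plain Python:

```python
# Linear kernel: K(x, y) = x . y + c, with an optional constant c.
def linear_kernel(x, y, c=0.0):
    return sum(a * b for a, b in zip(x, y)) + c

value = linear_kernel((1.0, 2.0), (3.0, 4.0), c=1.0)   # 1*3 + 2*4 + 1 = 12
```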

**1.3.2.6.2 Polynomial Kernel**

It is a nonstationary kernel function, where first all the training data are normalized for the particular problem. Here, *x* and *y* are input vectors, *c* is a constant term (an adjustable parameter), *a* is a slope, and *d* is a polynomial degree that can be selected by the user:

*K(x, y)* = (*a* *x*^{T}*y* + *c*)^{d}
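A direct sketch of this kernel with its three adjustable parameters:

```python
# Polynomial kernel: K(x, y) = (a * (x . y) + c) ** d.
def polynomial_kernel(x, y, a=1.0, c=1.0, d=2):
    return (a * sum(u * v for u, v in zip(x, y)) + c) ** d

value = polynomial_kernel((1.0, 2.0), (3.0, 4.0))   # (11 + 1)^2 = 144
```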

**1.3.2.6.3 Gaussian Kernel**

This is an essential kernel function that can be applied in different learning algorithms. It is well known for supervised learning classification with the SVM and SVR. It is also known as a radial basis kernel, where *x* is the center and *σ* is the radius, which can be given by the user. Here, the two samples *x* and *y* express feature vectors in the input domain, ‖*x* − *y*‖^{2} represents the squared Euclidean distance between the two feature vectors, and *σ* is a free parameter:

*K(x, y)* = exp(−‖*x* − *y*‖^{2} / (2*σ*^{2}))
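A sketch of this kernel, showing its characteristic behavior: identical points score 1, and the similarity decays toward 0 with distance:

```python
import math

# Gaussian (radial basis) kernel: K(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).
def gaussian_kernel(x, y, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

same = gaussian_kernel((1.0, 2.0), (1.0, 2.0))   # identical points -> 1.0
far = gaussian_kernel((0.0, 0.0), (5.0, 5.0))    # distant points -> near 0
```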

**1.3.2.6.4 Analysis of Variance (ANOVA) Kernel**

It is a very particular type of kernel; convolution kernels such as the ANOVA and Gaussian kernels belong to this family. To design and develop an ANOVA kernel, we have to consider *X* = *S*^{N} for some set *S* and kernels *k*^{(i)} on *S* × *S*, where *i* = 1, . . . , *N*. For *P* = 1, . . . , *N*, the ANOVA kernel of order *P* is explained as follows:

*K*_{P}(x, z) = Σ_{1 ≤ i_{1} < · · · < i_{P} ≤ N} *k*^{(i_{1})}(x_{i_{1}}, z_{i_{1}}) · · · *k*^{(i_{P})}(x_{i_{P}}, z_{i_{P}})
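A sketch of the ANOVA kernel in plain Python, assuming for simplicity the same base kernel *k(u, v)* = *uv* on every coordinate:

```python
import itertools

# ANOVA kernel of order P over S^N:
# K_P(x, z) = sum over 1 <= i1 < ... < iP <= N of prod_p k(x_ip, z_ip),
# here with the base kernel k(u, v) = u * v on each coordinate.
def anova_kernel(x, z, P, base=lambda u, v: u * v):
    total = 0.0
    for idx in itertools.combinations(range(len(x)), P):
        prod = 1.0
        for i in idx:
            prod *= base(x[i], z[i])
        total += prod
    return total

x, z = (1.0, 2.0, 3.0), (1.0, 1.0, 1.0)
order1 = anova_kernel(x, z, 1)   # 1 + 2 + 3 = 6
order2 = anova_kernel(x, z, 2)   # 1*2 + 1*3 + 2*3 = 11
```

Order *P* = 1 reduces to a sum of per-coordinate kernels, while higher orders capture interactions between subsets of *P* coordinates.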

**1.3.2.6.5 Hyperbolic Tangent (Sigmoid) Kernel**

It is also a particular type of kernel, applied in many machine learning fields such as ANNs and FFNNs, where we can use *a* and *r* with *a* > 0. The parameter *a* is taken as a scaling parameter of the input dataset, and *r* can be used as a shifting parameter, which controls the threshold of the mapping. If *a* < 0, it reverses and scales the dot product of the input dataset:

*K(x, y)* = tanh(*a* *x*^{T}*y* + *r*)
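A sketch of this kernel; note that the tanh squashes every similarity into the interval (−1, 1):

```python
import math

# Hyperbolic tangent (sigmoid) kernel: K(x, y) = tanh(a * (x . y) + r),
# with scaling parameter a and shifting parameter r.
def sigmoid_kernel(x, y, a=1.0, r=0.0):
    return math.tanh(a * sum(u * v for u, v in zip(x, y)) + r)

v = sigmoid_kernel((0.5, 0.5), (0.5, 0.5), a=1.0, r=0.0)   # tanh(0.5)
```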

**1.3.2.6.6 B-Spline (RBF) Kernel**

It is a specialized type of kernel defined on the interval [−1, 1]. It can be explained with the help of a recursive formula:

*K(x, y)* = *B*_{2p+1}(*x* − *y*)

where *p* ∈ *N* and the spline is built recursively by convolution, *B*_{p+1} = *B*_{p} ⊗ *B*_{0}.

*1.3.2.7 HYBRID KERNEL METHOD*

*A* hybrid or mixture kernel can enhance the capability of analysis and prediction, so it can be used widely. For this purpose, we can take a number of candidate kernels and try to implement a method that finds a sparse linear mixture of these kernels to enhance the productivity of prediction and analysis on a future dataset. The SVM is a prominent example for this type of kernel. By using a particular mixture of kernels, we can produce a customized SVM.

Two or more kernels can be combined from simple kernels to develop a mixture kernel, or new kernel, as follows.

Let *K*_{1} and *K*_{2} be kernels over *X* × *X*, *X* ⊆ *R*^{n}, *a* ∈ *R*^{+}, let *f*(·) be a real-valued function on *X*, and let *φ*: *X* → *R*^{m} be a mapping, where *K*_{3} is a kernel over *R*^{m} × *R*^{m} and *B* is a symmetric positive-semidefinite *n* × *n* matrix. Then, the following functions are kernels:

- *K(x, y)* = *K*_{1}(x, y) + *K*_{2}(x, y)
- *K(x, y)* = *K*_{1}(x, y)*K*_{2}(x, y)
- *K(x, y)* = *aK*_{1}(x, y)
- *K(x, y)* = *K*_{3}(*φ*(x), *φ*(y))
- *K(x, y)* = *f*(x)*f*(y)
- *K(x, y)* = *x*^{T}*By*

These are the fundamental procedures to develop the mixture kernel, which is also a valid kernel.
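The first two closure rules can be checked numerically on toy data: quadratic forms of the sum and the elementwise (Schur) product of two Gram matrices stay non-negative, as positive semidefiniteness requires:

```python
import math, random

# Build Gram matrices for two genuine kernels, combine them by sum and by
# elementwise product, and probe v^T K v with random vectors: it must be >= 0.
def gram(kernel, xs):
    return [[kernel(a, b) for b in xs] for a in xs]

def quad_form(K, v):
    n = len(v)
    return sum(v[i] * K[i][j] * v[j] for i in range(n) for j in range(n))

def linear(u, v):
    return sum(a * b for a, b in zip(u, v))

def rbf(u, v):
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)))

xs = [(0.0, 1.0), (1.0, 0.0), (2.0, 2.0)]
K1, K2 = gram(linear, xs), gram(rbf, xs)
n = len(xs)
K_sum = [[K1[i][j] + K2[i][j] for j in range(n)] for i in range(n)]
K_prod = [[K1[i][j] * K2[i][j] for j in range(n)] for i in range(n)]

rng = random.Random(1)
ok = all(
    quad_form(K, [rng.uniform(-1, 1) for _ in range(n)]) >= -1e-12
    for K in (K_sum, K_prod)
    for _ in range(200)
)
```

This is only a spot check, of course; the closure rules themselves guarantee the property for every vector, not just the sampled ones.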

Different types of kernels can be used for different subsets of *x*, and we can mix information taken from different sources; each kernel then measures similarity as defined by its own domain. If the two inputs are taken from two different sources, we obtain

*K(x, x′)* = *K*_{A}(x_{A}, x′_{A}) + *K*_{B}(x_{B}, x′_{B})

where *x* = [*x*_{A}, *x*_{B}] is the concatenation of the two different representations from the different sources.

This method provides a linear mixture of kernels. Different types of kernels can be used to predict different types of datasets. However, not all kernels produce appropriate results. Therefore, finding appropriate kernels and their appropriate mixture (linear, quadratic, conditional, etc.) is a big challenge.

*1.3.2.8 HYBRID MODELING MECHANISM*

As we discussed earlier, the individual model has its own drawbacks such as local optima, overfitting, and difficulty in selecting parameters, which directly reduce prediction accuracy [34,35]. These problems can be solved by combining two or more models in a single approach to get a more precise and accurate result. It is a combination of parametric and nonparametric approaches. First, the parametric method is applied to design the model and train it on the price movement. Second, a nonparametric method is used on the result of the parametric method to enhance and improve the final performance. The parametric approach is applied in the case of the binomial tree, the finite-difference method, and the Monte Carlo method, whereas the nonparametric method is applied in the linear neural network, MLP, and SVM. Different researchers provide hybridization, which is a combination of two or more individual models.

*1.3.2.9 TWIN SUPPORT VECTOR MACHINE*

The TWSVM is an advanced and new kind of framework that gives eminent analysis and prediction results for regression as well as classification [36]. It is based on the generalized eigenvalue proximal SVM and produces two nonparallel planes by solving a pair of quadratic programming problems. The experimental results suggest that it is better than the conventional SVM. Earlier, the TWSVM was constructed to solve binary classification problems. However, after later modifications, it began to be applied to multiclass classification problems. Due to its promising empirical results, the application of the TWSVM has gradually increased, and it provides better analysis and prediction capability in all types of classification and regression problems.

*1.3.2.10 FINANCIAL TRADING FRAMEWORK WITH FORECASTING*

Cavalcante et al. [37] proposed a model to forecast the stock market by using a machine learning framework. Producing a computationally intelligent system that can be applied to time-series forecasting demands the following functional steps: data preparation, algorithm definition, training, and forecasting evaluation. In the first step, data preparation, all abnormalities and noise should be detected and normalized before further processing. The next stage requires algorithm definition and the determination of the soft computing tools and techniques used to model and forecast the data, where the architectural method can be defined and selected. Creating a model requires selecting an appropriate training process or method along with the adjustment of the training parameters and procedure. In the last phase, evaluation metrics and accuracy measurements are obtained by simulation on the trained dataset. Two additional phases are required for the betterment of an intelligent prediction system: the trading strategy and the money evaluation. Stock market forecasting techniques are used to get the deviation between the real and forecasted datasets in order to minimize the loss. The last added stage, the money evaluation, analyzes the money-generating capability of the financial forecast system rather than its forecast accuracy alone. The significant goal is to maximize profit with the help of stock market prediction.
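The four functional steps can be sketched end to end on a hypothetical series, with a simple lag-one least-squares predictor standing in for the algorithm-definition step:

```python
# Sketch of the pipeline: (1) data preparation, (2) algorithm definition,
# (3) training, (4) forecasting evaluation, on an illustrative price series.
raw = [100.0, 102.0, 101.0, 105.0, 104.0, 108.0, 107.0, 111.0]

# 1. Data preparation: min-max normalization to [0, 1].
lo, hi = min(raw), max(raw)
series = [(v - lo) / (hi - lo) for v in raw]

# 2. Algorithm definition + 3. Training: fit y_t ~ slope * y_{t-1} + intercept
# by closed-form least squares on lagged pairs.
pairs = list(zip(series[:-1], series[1:]))
n = len(pairs)
mx = sum(p[0] for p in pairs) / n
my = sum(p[1] for p in pairs) / n
slope = sum((x - mx) * (y - my) for x, y in pairs) \
        / sum((x - mx) ** 2 for x, _ in pairs)
intercept = my - slope * mx

# 4. Forecasting evaluation: one-step-ahead predictions and mean abs error.
preds = [slope * x + intercept for x, _ in pairs]
mae = sum(abs(p - y) for p, (_, y) in zip(preds, pairs)) / n
```

The two additional phases the paragraph mentions, the trading strategy and the money evaluation, would sit on top of this loop: the forecasts drive buy/sell decisions, and the resulting profit (not just `mae`) is what gets assessed.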