Implementation of the Data Analytics Methods for the Forecast
To predict seasonal diseases, either statistical or structural models can be used (Hyndman and Athanasopoulos, 2018).
Statistical methods considered in this investigation include the following:
- Classical Poisson method.
- Dixon-Coles method.
- Least squares method.
- Autoregressive model of variable mean.
- Model of simple exponential smoothing.
- Holt exponential smoothing model.
- Holt-Winters exponential smoothing model.

Figure 9.5 Features of the decision tree.
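As an illustration, the two simplest smoothing methods in the list above can be sketched in a few lines of Python. The smoothing parameters `alpha` and `beta` and the weekly case counts are illustrative assumptions, not values from this study.

```python
def simple_exp_smoothing(series, alpha=0.3):
    """Simple exponential smoothing: level = alpha*y + (1-alpha)*level.
    Returns the one-step-ahead forecast (the final smoothed level)."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

def holt_smoothing(series, alpha=0.3, beta=0.1):
    """Holt's method adds a linear trend component on top of the level."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend  # one-step-ahead forecast

cases = [120, 132, 141, 155, 168, 180]  # hypothetical weekly case counts
print(simple_exp_smoothing(cases))
print(holt_smoothing(cases))
```

Because the hypothetical series trends upward, Holt's method, which extrapolates the trend, produces a noticeably higher forecast than simple smoothing, which lags behind a trending series.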
On the other hand, the following were selected from the existing structural methods:
- Neural network with nonlinear autoregressive model.
- Multilayer perceptron with five hidden layers.
- Multilayer perceptron with automatic detection of the number of hidden layers.
- Extreme learning machine.
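The extreme learning machine is the simplest of these structural models to sketch: the hidden-layer weights are drawn at random and frozen, and only the output weights are fitted in closed form by least squares. The synthetic incidence series, lag window of 3, and 20 hidden units below are assumptions for illustration only, not the configuration used in this study.

```python
import numpy as np

def elm_fit(X, y, n_hidden=20, seed=0):
    """Extreme learning machine: random fixed hidden weights,
    output weights solved by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # random nonlinear features
    H = np.column_stack([H, np.ones(len(H))])     # bias column for the output layer
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = np.tanh(X @ W + b)
    H = np.column_stack([H, np.ones(len(H))])
    return H @ beta

# Autoregressive setup: predict y_t from the previous 3 (standardized) values
series = np.sin(np.arange(60) / 4.0) * 50 + 100   # synthetic "incidence" curve
lags = 3
X = np.array([series[i:i + lags] for i in range(len(series) - lags)])
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = series[lags:]
W, b, beta = elm_fit(X, y)
pred = elm_predict(X, W, b, beta)
print(np.abs(pred - y).mean())  # in-sample MAE
```

Because training reduces to one linear solve, the extreme learning machine avoids the resource-intensive iterative learning process noted for neural networks in Table 9.1, at the cost of an unoptimized hidden layer.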
Besides the methods mentioned above, another classification could be applied:
- Regression models and methods.
- Autoregressive models and methods.
- Models and methods of exponential smoothing.
- Neural network models and methods.
Following this suggested classification, Table 9.1 systemizes the strengths and weaknesses of the above-mentioned approaches.
To estimate the accuracy of the prediction methods, time series forecasting error measures will be used (Hyndman and Koehler, 2006).
Table 9.1 Comparison of the Methods and Models

| MODEL AND METHOD | ADVANTAGES | DISADVANTAGES |
| --- | --- | --- |
| Autoregressive models and methods | Simplicity, uniformity of analysis and design; numerous application examples | Complexity of model identification; impossibility of modelling nonlinearities; low adaptability |
| Models and methods of exponential smoothing | Simplicity, uniformity of analysis and design | Insufficient flexibility; narrow applicability of models |
| Neural network models and methods | Nonlinearity of models; scalability; high adaptability; uniformity of analysis and design; large set of examples | Lack of transparency; complexity of choice of architecture; stringent training sample requirements; complexity of choosing a learning algorithm; resource-intensive learning process |
The most common time series forecasting error measures are presented below. Here $y_t$ is the actual value, $\hat{y}_t$ is the forecast, $e_t = y_t - \hat{y}_t$ is the forecast error, $\bar{e}$ is the mean error, and $n$ is the number of observations.

MAPE - mean absolute percentage error:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \qquad (9.2)$$

MAE - mean absolute error:

$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| y_t - \hat{y}_t \right| \qquad (9.3)$$

MSE - mean square error:

$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2 \qquad (9.4)$$

RMSE - root mean square error:

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}} \qquad (9.5)$$

ME - mean error:

$$\mathrm{ME} = \frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right) \qquad (9.6)$$

SD - standard deviation:

$$\mathrm{SD} = \sqrt{\frac{1}{n-1} \sum_{t=1}^{n} \left( e_t - \bar{e} \right)^2} \qquad (9.7)$$
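All six error measures can be computed together; the sketch below is a direct transcription of the standard definitions, with illustrative actual and predicted values rather than data from this study.

```python
import math

def forecast_errors(actual, predicted):
    """Compute MAPE, MAE, MSE, RMSE, ME, and SD of the forecast errors."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]  # e_t = y_t - yhat_t
    mape = 100.0 / n * sum(abs(e) / abs(a) for e, a in zip(errors, actual))
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    me = sum(errors) / n
    sd = math.sqrt(sum((e - me) ** 2 for e in errors) / (n - 1))
    return {"MAPE": mape, "MAE": mae, "MSE": mse, "RMSE": rmse, "ME": me, "SD": sd}

actual = [100, 110, 120, 130]     # illustrative observations
predicted = [98, 113, 118, 131]   # illustrative forecasts
print(forecast_errors(actual, predicted))
```

Note that ME keeps the sign of the errors, so positive and negative errors cancel; SD then measures how much the individual errors spread around that mean.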
Forecast accuracy is the opposite concept to the prediction error: if the forecast error is large, the accuracy is low, and conversely, if the prediction error is small, the accuracy is high (Khair et al., 2017). In fact, the forecast error estimate is the complement of the forecast accuracy, and the dependence is simple:

Forecast accuracy in % = 100% - MAPE (9.8)

Usually, the accuracy itself is not estimated directly; instead, the result of the forecasting task is evaluated by determining the magnitude of the prediction error. It should be understood, however, that if, for example, MAPE = 5%, then the prediction accuracy is 95%. Talk of high accuracy is always talk of low forecast error, and there should be no misunderstanding in this area.

Since MAPE is a quantitative estimate of the error itself, it directly determines the accuracy of prediction through the simple formula above. Thus, when estimating the error, we always also estimate the accuracy of the prediction.
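Equation (9.8) applied to the MAPE of the best cold-forecast model in Table 9.2 gives its accuracy:

```python
mape = 5.5145                 # MAPE of the best model in Table 9.2
accuracy = 100.0 - mape       # equation (9.8)
print(accuracy)
```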
According to Table 9.2, the best model is the neural network with a nonlinear autoregressive model. The forecast results of this model are shown in Figure 9.6.
Table 9.2 Comparison of Different Methods for the Colds Forecasts

| METHOD | MAPE | MAE | MSE | RMSE | ME | SD |
| --- | --- | --- | --- | --- | --- | --- |
| Autoregressive model of variable mean | 7.1849 | 29.3399 | 2069.1879 | 45.48832 | 0.1427 | 45.532 |
| Autoregressive model of variable mean (custom) | 6.8860 | 28.6838 | 1930.2694 | 43.9348 | 0.5027 | 43.9747 |
| Model of simple exponential smoothing | 8.9315 | 39.2918 | 4190.0204 | 64.7303 | -0.2436 | 64.7929 |
| Holt exponential smoothing model | 7.9362 | 35.6543 | 3899.7392 | 62.4478 | -0.3903 | 62.5075 |
| Holt-Winters exponential smoothing model | 17.5070 | 75.3171 | 15462.2587 | 124.34 | -0.9435 | 57.8242 |
| Neural network with nonlinear autoregressive model | 5.5145 | 22.4491 | 1135.5233 | 33.6975 | -0.043 | 34.0067 |
| Multilayer perceptron with five hidden layers | 6.0216 | 24.7753 | 1425.9549 | 37.7618 | -0.1517 | 37.2962 |
| Multilayer perceptron with automatic detection of the number of hidden layers | 6.8004 | 28.7822 | 2142.1202 | 46.2830 | 0.00071 | 46.3384 |
| Extreme learning machine | 7.3069 | 31.6591 | 2605.2752 | 51.0418 | 0.0116 | 51.7630 |
Figure 9.6 Forecast with neural network with nonlinear autoregressive model.
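The model ranking can be reproduced directly from the MAPE values reported in Table 9.2; selecting the best model is simply a minimum over the error measure.

```python
# MAPE values transcribed from Table 9.2
mape_by_method = {
    "Autoregressive model of variable mean": 7.1849,
    "Autoregressive model of variable mean (custom)": 6.8860,
    "Model of simple exponential smoothing": 8.9315,
    "Holt exponential smoothing model": 7.9362,
    "Holt-Winters exponential smoothing model": 17.5070,
    "Neural network with nonlinear autoregressive model": 5.5145,
    "Multilayer perceptron with five hidden layers": 6.0216,
    "Multilayer perceptron with automatic detection of the number of hidden layers": 6.8004,
    "Extreme learning machine": 7.3069,
}
best = min(mape_by_method, key=mape_by_method.get)
print(best, mape_by_method[best])
```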
Implementation of the Data Analytics Methods for the Football Matches Forecasts
For the football match forecasts, let us consider the following methods (Harville, 2003):
- Classical Poisson method (Maher, 1982).
- Dixon-Coles method (Dixon and Coles, 1997).
- Method of time-independent least squares ratings.
- "Predicting football results using a neural network based on FIFA Rating" method (Graham, 2018).
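The basic Poisson method (Maher, 1982) can be sketched as follows: home and away goals are modelled as independent Poisson variables, and the outcome probabilities are obtained by summing over the score grid. The goal rates below are illustrative assumptions, not rates fitted to real team data.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of scoring exactly k goals given rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def match_probabilities(home_rate, away_rate, max_goals=10):
    """Basic Poisson model: home and away goals are independent Poisson variables.
    Returns (P(home win), P(draw), P(away win)), truncated at max_goals."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win

# Illustrative goal rates; a real model estimates them from attack/defence strengths
hw, d, aw = match_probabilities(1.6, 1.1)
print(round(hw, 3), round(d, 3), round(aw, 3))
```

The Dixon-Coles method refines this sketch by correcting the probabilities of low-scoring results (0-0, 1-0, 0-1, 1-1), where the independence assumption fits observed data worst.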
Table 9.3 Comparison of Different Methods for the Football Matches Forecasts

| METHOD | MAPE | MAE | MSE | RMSE | ME | SD |
| --- | --- | --- | --- | --- | --- | --- |
| Basic Poisson model | 9.93 | 42.03 | 4479.00 | 66.9253 | 0.2373 | 67.8978 |
| Dixon-Coles method | 7.94 | 35.67 | 3900.74 | 62.4559 | 0.2451 | 63.1456 |
| Mean square method | 7.80 | 34.68 | 2783.43 | 52.7582 | 0.2092 | 52.9896 |
| Deep neural network | 6.51 | 27.55 | 2050.45 | 45.2819 | 0.1812 | 45.5678 |
According to Table 9.3, the best model is the deep neural network, which has the lowest MAPE (6.51).