Stock return forecasting: U.S. S&P500

For hundreds of years, investors have been fascinated by the variability of speculative prices. Investment practitioners have developed many a tool to forecast the future path of the prices in the hope that they can make great fortunes with good forecasts. One of the most commonly used forecasting tools is technical analysis. Despite its popularity among technicians, the value of technical analysis is still controversial due to its subjective nature. In contrast to fundamental analysis, which was quick to be adopted by the scholars of modern quantitative finance, academic scrutiny of technical analysis is still in its infancy. It has been argued that the difference between fundamental analysis and technical analysis is not unlike the difference between astronomy and astrology. Among some circles, technical analysis is even known as “voodoo finance”.

However, several academic studies suggest that technical analysis may well be an effective means for extracting useful information from market prices. For example, Lo and MacKinlay (1988, 1999) have shown that past prices may be used to forecast future returns to some degree, a fact that all technical analysts take for granted. Studies by Tabell and Tabell (1964), Treynor and Ferguson (1985), Brown and Jennings (1989), Jegadeesh and Titman (1993), Blume et al. (1994), Chan et al. (1996), Lo and MacKinlay (1997), Grundy and Martin (1998), and Rouwenhorst (1998) have also provided indirect support for technical analysis, and more direct support has been given by Pruitt and White (1988), Neftci (1991), Brock et al. (1992), Neely et al. (1997), Neely and Weller (1998), Chang and Osler (1994), Osler and Chang (1995), and Allen and Karjalainen (1999). Lo et al. (2000) found that over the 31-year sample period, several technical indicators do provide incremental information and may have some practical value. Recent academic results show that high-low price extremes are valuable for forecasting speculative prices (Xie et al., 2012; Xie et al., 2012; Xie and Wang, 2013; Xie et al., 2013; Xie et al., 2014; Xie et al., 2015; Xie and Wang, 2018).

Despite the growing evidence supporting the practical value of technical analysis, it is still frequently criticized due to its highly subjective nature and its lack of theoretical underpinnings.

In this chapter, we are going to scrutinize the forecasting power of the candlestick using the DVAR model proposed in Chapter 5. The main purpose of this chapter is to see if the statistical properties of the candlestick can be used to improve stock return forecasting.

10.1 Introduction

The predictability of stock market returns is of great interest to both academic researchers and investment practitioners, and numerous economic and financial variables have been identified as predictors of stock returns in the academic literature. Examples include valuation ratios, such as the dividend-price (Dow, 1920; Fama and French, 1988, 1989), earnings-price (Campbell and Shiller, 1988, 1989), and book-to-market (Kothari and Shanken, 1997; Pontiff and Schall, 1998) ratios, as well as nominal interest rates (Fama and Schwert, 1977; Campbell, 1987; Breen et al., 1989; Ang and Bekaert, 2007), the inflation rate (Nelson, 1976; Fama and Schwert, 1977; Campbell and Vuolteenaho, 2004), term and default spreads (Campbell, 1987; Fama and French, 1989), corporate issuing activity (Baker and Wurgler, 2000; Boudoukh et al., 2007), the consumption-wealth ratio (Lettau and Ludvigson, 2001), and stock market volatility (Guo, 2006).1 Almost all of these studies focus on in-sample tests and report significant evidence of in-sample return predictability.

Despite the broad agreement on the in-sample predictability of stock returns, evidence of out-of-sample predictability remains controversial. Bossaerts and Hillion (1999), Ang and Bekaert (2007), and Goyal and Welch (2003) cast doubt on the in-sample evidence documented by the early authors by showing that these variables have negligible out-of-sample predictive power. Among these studies, Goyal and Welch (2008) take a comprehensive look at the empirical performance of stock return predictors, and show that a long list of predictors from the literature is unable to deliver consistently superior out-of-sample forecasts of U.S. stock returns relative to a simple historical mean forecast. In contrast, recent empirical studies identify new predictor variables and econometric methods that can improve the out-of-sample predictability of stock returns. New predictor variables include technical indicators (Neely et al., 2014; Han et al., 2013; Goh et al., 2013; Huang and Zhou, 2013) and sentiment indices (Baker and Wurgler, 2006; Stambaugh et al., 2012; Huang et al., 2013). New econometric methods include the support vector machine method (Huang et al., 2005), economically motivated model restrictions (Campbell and Thompson, 2008; Ferreira and Santa-Clara, 2011), combination forecasts (Rapach et al., 2010), diffusion indices (Ludvigson and Ng, 2007; Kelly and Pruitt, 2012; Neely et al., 2014), regime shifts (Guidolin and Timmermann, 2007; Henkel et al., 2011; Dangl and Halling, 2012), and sequential learning (Johannes et al., 2014). Rapach and Zhou (2013) provide a more extensive survey of the vast literature on predictors and methodologies concerning stock return predictability.

Unlike the existing methods, we use the DVAR model proposed in Chapter 5 to scrutinize the out-of-sample predictability of the U.S. stock market and the economic value of candlestick forecasting. Out-of-sample forecasting is performed on the monthly stock returns of the U.S. S&P500 index over 1995.01-2015.12. To mitigate the concern of data mining, we use the out-of-sample R-square of Campbell and Thompson (2008) as the statistical performance measure and the certainty equivalent return (CER) gain as the economic performance measure. We find that the DVAR model delivers significant out-of-sample predictability for the U.S. stock market.

The rest of the chapter is organized as follows. Section 2 describes the econometric methodologies. Section 3 presents the empirical results on the predictability of the U.S. stock market. Section 4 presents some details of U.S. stock market predictability. A summary is presented in Section 5.

10.2 Econometric methods

10.2.1 The model

For univariate time series modeling, one of the most commonly used benchmark models is the ARMA-GARCH-in-Mean model (see Eq. (5.3), p.28). This model can simultaneously capture the linear autocorrelation in the return series and the risk-return tradeoff. In this chapter, only the following ARMA-GARCH-in-Mean model of order (1, 1) is used, as it has been well documented that the order (1, 1) is sufficient to capture the return dynamics.

where $\gamma$ is the coefficient of relative risk aversion, reflecting the risk-return tradeoff.
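For orientation, a common textbook form of an ARMA(1,1)-GARCH(1,1)-in-Mean model is sketched below. The symbols $\phi$, $\theta$, $\omega$, $\alpha$, and $\beta$ are assumptions introduced here for illustration only. The exact specification used in this chapter is the one in Eq. (5.3) and may differ, for example in whether the in-mean term enters as the conditional variance or as the conditional standard deviation.

```latex
\begin{align}
r_t &= c + \phi\, r_{t-1} + \theta\, \varepsilon_{t-1} + \gamma\, \sigma_t^2 + \varepsilon_t,
  \qquad \varepsilon_t = \sigma_t z_t, \quad z_t \sim \text{i.i.d.}(0, 1), \\
\sigma_t^2 &= \omega + \alpha\, \varepsilon_{t-1}^2 + \beta\, \sigma_{t-1}^2 .
\end{align}
```

In this form a positive $\gamma$ means that a higher conditional variance is compensated by a higher expected return, which is the risk-return tradeoff referred to in the text.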

Another model used in this chapter is the DVAR model proposed in Chapter 5. A DVAR model of order p is given by

where $Y_t = (\Delta R_t, \Delta W_t)'$ and $X_{t-1}$ is a vector of exogenous variables. The exogenous variables used in this chapter include the upper shadow, the lower shadow, and other variables. The upper shadow and the lower shadow are used because we have shown in Chapter 6 that these two variables are informative for predicting both $\Delta R_t$ and $\Delta W_t$. For more about DVAR, readers can refer to Chapter 5.

Note that the DVAR model does not directly predict the stock return. To obtain the return forecasts, we proceed in an indirect way. To be specific, the return forecasts are constructed through the following equation

where $\hat{r}_t$ is the return forecast, and $\widehat{\Delta R}_t$ and $\widehat{\Delta W}_t$ are respectively the forecasts of $\Delta R_t$ and $\Delta W_t$ reported by the DVAR model.

10.2.2 Out-of-sample evaluation

A potential problem with in-sample predictability is over-fitting. In a comprehensive study, Goyal and Welch (2008) found that many macroeconomic variables, though they deliver significant in-sample forecasts, perform poorly out of sample. Following Goyal and Welch (2008), we also use the out-of-sample $R^2$ statistic, $R^2_{out}$, which is defined as

where $\hat{r}_t(m)$ is the return forecast given by model m, and $\bar{r}_t$ is the historical mean forecast given by

The historical mean forecast is equivalent to an efficient market model, and thus serves as a natural benchmark. A positive $R^2_{out}$ indicates that the model m forecast outperforms the simple historical mean forecast, while a negative $R^2_{out}$ indicates the opposite.
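The out-of-sample comparison can be sketched in a few lines. This is a minimal illustration, assuming the historical mean forecast for period t is the average of all returns observed before t and that $R^2_{out}$ is one minus the ratio of the model's summed squared errors to the benchmark's. The function names and the toy return series are hypothetical.

```python
def historical_mean_forecasts(returns, first_oos):
    """Forecast for period t is the mean of the returns observed before t."""
    return [sum(returns[:t]) / t for t in range(first_oos, len(returns))]


def out_of_sample_r2(actual, model_forecast, benchmark_forecast):
    """One minus the ratio of model to benchmark summed squared forecast errors."""
    sse_model = sum((r - f) ** 2 for r, f in zip(actual, model_forecast))
    sse_bench = sum((r - b) ** 2 for r, b in zip(actual, benchmark_forecast))
    return 1.0 - sse_model / sse_bench


# Made-up monthly returns; the last three months form the evaluation period.
returns = [0.01, -0.02, 0.03, 0.00, 0.02, -0.01]
first_oos = 3
actual = returns[first_oos:]
benchmark = historical_mean_forecasts(returns, first_oos)

r2_perfect = out_of_sample_r2(actual, list(actual), benchmark)   # zero-error forecast
r2_benchmark = out_of_sample_r2(actual, benchmark, benchmark)    # benchmark against itself
```

A forecast with no error gives a value of one, and the benchmark compared with itself gives zero, so any model that is worse than the historical mean ends up below zero.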

The $R^2_{out}$ statistic measures the reduction in mean squared forecast error (MSFE) for model m relative to the historical mean forecast. To see if the reduction is significant, we test the null hypothesis that $R^2_{out} \le 0$ against the alternative hypothesis that $R^2_{out} > 0$. Following Rapach et al. (2010), we test this hypothesis using the Clark and West (2007) MSFE-adjusted statistic. Define

The Clark and West (2007) MSFE-adjusted statistic is then the t-statistic from the regression of $f_t$ on a constant.
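The following sketch computes an MSFE-adjusted statistic of this kind. It assumes the usual Clark and West (2007) adjustment, in which $f_t$ is the benchmark's squared error minus the model's squared error, plus the squared difference between the two forecasts. It also uses the fact that regressing a series on a constant yields the t-statistic of its sample mean. The function name and the toy numbers are hypothetical.

```python
import math


def clark_west_tstat(actual, model_forecast, benchmark_forecast):
    """t-statistic of the mean of f_t, i.e. of a regression of f_t on a constant."""
    f = [
        (r - b) ** 2 - ((r - m) ** 2 - (b - m) ** 2)
        for r, m, b in zip(actual, model_forecast, benchmark_forecast)
    ]
    n = len(f)
    mean_f = sum(f) / n
    var_f = sum((x - mean_f) ** 2 for x in f) / (n - 1)   # sample variance of f_t
    return mean_f / math.sqrt(var_f / n)


# Toy example: the model tracks the realized returns more closely than the benchmark.
actual = [0.02, -0.01, 0.03, 0.00]
benchmark = [0.01, 0.01, 0.01, 0.01]
model = [0.015, -0.005, 0.02, 0.005]
t_stat = clark_west_tstat(actual, model, benchmark)
```

The statistic is undefined when the model and benchmark forecasts coincide in every period, because $f_t$ is then identically zero.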

10.3 Statistical evidence

This section describes the data and presents the in-sample and out-of-sample forecasting results.

10.3.1 The data

We scrutinized the monthly return predictability of the U.S. stock market using the S&P500 index. The data span January 1950 through December 2015, giving 792 observations, and were downloaded from www.finance.yahoo.com.

For each month, the high, low, close and open prices were reported. The risk-free bill rates came from www.bus.emory.edu/AGoyal/Research.html. From these prices, we constructed the lower shadow ($ls_t$) by Eq. (3.5)-(3.6), the upper shadow ($us_t$) by Eq. (3.3)-(3.4), $\Delta R_t$ and $\Delta W_t$ by Eq. (5.6)-(5.7), and the asset returns.

Candlestick chart forecasting is often used with market states. We define the market to be in an up-trend state if the market index is above its 200-day moving average and in a down-trend state otherwise. The 200-day moving average has been widely used by practitioners and is available in investment letters, trading software, and newspapers, which mitigates the concerns of data mining and data snooping. The 200-day moving average has also been used in Huang et al. (2013) to define market states. To be specific, the market state variable, $ms_t$, is constructed from the daily stock price, given by

where $P_t$ is the daily price level of the market index.2
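The market state rule can be illustrated with a short sketch. It assumes that the up-trend state is coded as 1 and the down-trend state as 0, and that the moving average includes the current day. Both are assumptions made for illustration, and the function name and price series are hypothetical.

```python
def market_state(daily_prices, window=200):
    """Return 1 (up-trend) if the latest price is above its moving average, else 0."""
    if len(daily_prices) < window:
        raise ValueError("need at least `window` daily prices")
    moving_average = sum(daily_prices[-window:]) / window
    return 1 if daily_prices[-1] > moving_average else 0


# Made-up price paths: one steadily rising, one steadily falling.
rising = [100 + 0.1 * i for i in range(250)]
falling = list(reversed(rising))

state_up = market_state(rising)      # latest price above the 200-day average
state_down = market_state(falling)   # latest price below the 200-day average
```

For monthly forecasting, the function would be evaluated on the daily prices up to the last trading day of the current month, and the result used as the state for the next month.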

Table 10.1 reports the summary statistics of stock returns together with some other variables. We find no significant linear autocorrelation in the return series, which is consistent with our intuition that the U.S. stock market is quite efficient. For the other time series variables, the Ljung-Box Q statistics report significant linear autocorrelations. The ADF statistics show that all these time series are stationary.

10.3.2 In-sample estimation

Table 10.2 reports the estimates of the ARMA-GARCH-in-Mean model. Panel A reports the estimates of the mean equation and Panel B presents the estimates of the GARCH equation. The results indicate that there is a significant volatility clustering effect in stock returns. We can also see from the estimates of the mean equation that there is a positive but insignificant risk-return tradeoff. The low R-square statistic ($R^2$ = 1.5%) demonstrates that the forecasts reported by the ARMA-GARCH-in-Mean model contain almost no information about the realized returns. We also present in Table 10.2 the estimates of the ARMA-GARCH-in-Mean model with the state variable, $ms_{t-1}$ (the ARMAX-GARCH-in-Mean model). The coefficient on the state variable is 0.005 and is statistically significant at the 10% level. The R-square reported by the ARMAX-GARCH-in-Mean model increases to 2.4%. This result suggests that the state variable is informative for forecasting stock returns.

Table 10.1 Summary statistics of stock return and other variables

| | $r_t$ | $\Delta R_t$ | $\Delta W_t$ | $ls_t$ | $us_t$ |
|---|---|---|---|---|---|
| Mean | 0.006 | 0.006 | 0.000 | 0.018 | 0.014 |
| St.D. | 0.042 | 0.035 | 0.32 | 0.021 | 0.012 |
| Max. | 0.151 | 0.125 | 0.154 | 0.190 | 0.072 |
| Min. | -0.245 | -0.190 | -0.110 | 0.000 | 0.000 |
| Kurt. | 5.435 | 6.324 | 4.320 | 16.857 | 4.731 |
| Skew. | -0.655 | -1.017 | 0.406 | 2.907 | 1.313 |
| Jarque-Bera | 251.91*** | 500.579*** | 79.117*** | 7452.278*** | 326.507*** |
| Ljung-Box Q(12) | 16.412 | 78.184*** | 227.59*** | 102.82*** | 330.29*** |
| ADF | -26.802*** | -18.591*** | -15.199*** | -15.733*** | -7.948*** |

We use ***, **, and * to denote significance at the 1%, 5%, and 10% levels, respectively.

Table 10.2 Estimates of the ARMA-GARCH-in-Mean model

| | ARMA-GARCH-in-Mean: Coef. | ARMA-GARCH-in-Mean: t-Statistic | ARMAX-GARCH-in-Mean: Coef. | ARMAX-GARCH-in-Mean: t-Statistic |
|---|---|---|---|---|
| Panel A: The Mean Equation | | | | |
| c | -0.001 | -0.106 | -0.003 | -0.579 |
| a | -0.970 | -137.651 | -0.970 | -120.160 |
| p | 0.997 | 533.805 | 0.997 | 367.531 |
| $ms_{t-1}$ | | | 0.005 | 1.813 |
| $\gamma$ | 0.200 | 0.999 | 3.456 | 1.504 |
| Panel B: The GARCH Equation | | | | |
| $\omega$ | 9.61E-05 | 2.568 | 9.89E-05 | 2.550 |
| a | 0.834 | 26.398 | 0.834 | 25.494 |
| p | 0.115 | 4.583 | 0.113 | 4.535 |
| R-square | 0.015 | | 0.024 | |

Table 10.3 Estimates of the DVAR model: S&P500

| | $\Delta R_t$: Coef. | $\Delta R_t$: t-Statistic | $\Delta W_t$: Coef. | $\Delta W_t$: t-Statistic |
|---|---|---|---|---|
| c | -0.003 | -1.045 | 0.004 | 1.982 |
| a | 0.475 | 12.764 | -0.599 | -19.183 |
| p | 0.524 | 16.439 | -0.505 | -18.849 |
| $ls_{t-1}$ | 0.473 | 7.815 | -0.607 | -11.949 |
| $us_{t-1}$ | -0.484 | -6.173 | 0.597 | 9.070 |
| $ms_{t-1}$ | 0.006 | 2.467 | 0.004 | 1.848 |
| $r_{t-1} \times lus_{t-1}$ | 2.528 | 4.416 | 0.335 | 0.696 |
| R-square | 0.465 | | 0.541 | |

Table 10.3 presents the estimates of the DVAR model. The lag order of 1 in the DVAR model is chosen by the SIC criterion. Consistent with the theoretical results given in Chapter 6, we find that both the upper shadow and the lower shadow are informative for predicting $\Delta R_t$ and $\Delta W_t$. We also find that the state variable is important for forecasting $\Delta R_t$ and $\Delta W_t$. The high R-square statistics indicate the high predictability of $\Delta R_t$ and $\Delta W_t$. We also calculate the in-sample R-square, $R^2_{in}$, using

where $\bar{r}$ is the mean of $r_t$ over the whole sample, and $\hat{r}_t$ is the forecast given by the DVAR model (see Eq. (10.3)). The in-sample R-square reported by the DVAR model is 2.98%.

10.3.3 Out-of-sample forecast

For out-of-sample forecasting, the total T observations are divided into two portions. The first portion from 1 to M observations is used to estimate the coefficients, and the remaining portion from M + 1 to T is used for forecasting evaluation.

A static forecasting procedure is used. To be specific, we first use the M observations to obtain the estimates of the parameters, and then make out-of-sample forecasts with these estimates held fixed. In other words, the estimates of the parameters in the DVAR model are not updated with new information:

where $\hat{C}$, $\hat{A}_i$ and $\hat{\Gamma}$ are estimates of the parameters using only the first M observations. We use the data over 1995.01-2015.12 as the out-of-sample time period.
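The static procedure can be sketched as follows. This is a minimal illustration, assuming a first-order system with exogenous regressors dated t-1, estimated equation by equation with ordinary least squares. The function names, the simulated data, and the use of NumPy are assumptions made here and are not taken from the chapter.

```python
import numpy as np


def fit_var1x(Y, X, M):
    """OLS fit of Y_t = C + A Y_{t-1} + G X_{t-1} + e_t using only the first M rows."""
    Z = np.column_stack([np.ones(M - 1), Y[:M - 1], X[:M - 1]])  # regressors dated t-1
    B, *_ = np.linalg.lstsq(Z, Y[1:M], rcond=None)               # one column per equation
    return B


def static_forecasts(Y, X, M, B):
    """One-step-ahead forecasts for rows M..T-1 with the coefficients B held fixed."""
    T = len(Y)
    Z = np.column_stack([np.ones(T - M), Y[M - 1:T - 1], X[M - 1:T - 1]])
    return Z @ B


# Simulated, noise-free data so that the fitted coefficients reproduce the path exactly.
rng = np.random.default_rng(0)
T, M = 60, 40
X = rng.normal(size=(T, 2))
C = np.array([0.1, -0.2])
A = np.array([[0.5, 0.1], [-0.2, 0.3]])
G = np.array([[0.4, 0.0], [0.1, -0.3]])
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = C + A @ Y[t - 1] + G @ X[t - 1]

B = fit_var1x(Y, X, M)                    # estimated once, on the first M observations
forecasts = static_forecasts(Y, X, M, B)  # never re-estimated afterwards
```

The key design point is that `fit_var1x` is called once. A recursive or rolling scheme would instead re-estimate `B` before each new forecast.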

We compute the cumulative squared forecast error of each competing model

where $S_t$ is the cumulative squared forecast error, and $r_t$ and $\hat{r}_t(m)$ are respectively the return observation and the return forecast given by model m.

Figure 10.1 presents the time series plots of the cumulative squared forecast error. We use CUM_M to mean the cumulative squared forecast error reported by model M. It is clear that the DVAR model reports the lowest cumulative squared forecast error and then comes the ARMA-GARCH-in-Mean model, and the historical mean (HM) model has the largest cumulative squared forecast error.

Figure 10.1 Time series of cumulative squared forecast error: 1995.01-2015.12

Following Goyal and Welch (2008), we also compute the difference between the cumulative squared forecast error for the historical mean model and the cumulative squared forecast error for the competing model,

where $DS_t$ is the series of differences.

Figure 10.2 presents the time series plots of the differences. We use HM_M to mean the difference between the cumulative squared forecast error for the historical mean model and the cumulative squared forecast error for the competing model M. This is an informative graphical device that provides a visual impression of the consistency of a competing model’s out-of-sample forecasting performance relative to the historical mean model over time. When the curve increases, the competing model outperforms the historical mean model, while the opposite holds when the curve decreases. The plots conveniently illustrate whether a competing model has a lower mean squared forecast error (MSFE) than the historical mean model for any particular out-of-sample period by redrawing the horizontal zero line to the start of the out-of-sample period. A competing model that always dominates the historical mean model for any out-of-sample period will have a curve with a slope that is always positive; the closer a competing model is to this ideal, the greater its ability to consistently beat the historical mean model in terms of MSFE.
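The curve described above is straightforward to compute. The sketch below builds the cumulative squared forecast error for a benchmark and for a competing model and then takes the difference, period by period. The function names and the toy numbers are hypothetical.

```python
from itertools import accumulate


def cumulative_sfe(actual, forecast):
    """Running sum of squared forecast errors."""
    return list(accumulate((r - f) ** 2 for r, f in zip(actual, forecast)))


def cumulative_sfe_difference(actual, benchmark_forecast, model_forecast):
    """Benchmark cumulative error minus model cumulative error, period by period."""
    bench = cumulative_sfe(actual, benchmark_forecast)
    model = cumulative_sfe(actual, model_forecast)
    return [b - m for b, m in zip(bench, model)]


# Toy example: the model wins in the first two periods and loses badly in the third.
actual = [0.02, -0.01, 0.03]
benchmark = [0.0, 0.0, 0.0]
model = [0.01, -0.01, 0.07]
diff = cumulative_sfe_difference(actual, benchmark, model)
```

In this toy series the curve rises over the first two periods, where the model has the smaller error, and falls in the third, where the benchmark does better.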


Figure 10.2 Cumulative squared forecast error for the historical mean benchmark forecasting model minus the cumulative squared forecast error for the competing model: 1995.01-2015.12

Several findings emerge from Figure 10.2. First, both the DVAR and ARMA-GARCH-in-Mean models outperform the historical mean benchmark model, as the curves have ending points higher than starting points. Second, DVAR outperforms ARMA-GARCH-in-Mean, since the ending point of the DVAR curve is higher than the ending point of the ARMA-GARCH-in-Mean curve. Third, the DVAR model is more robust than the ARMA-GARCH-in-Mean model for out-of-sample forecasting, as the DVAR curve is less volatile than the ARMA-GARCH-in-Mean curve.

The out-of-sample R-squares reported by the DVAR model and the ARMA-GARCH-in-Mean model are respectively 4.82% and 1.70%, indicating outperformance of DVAR and ARMA-GARCH-in-Mean over the historical mean. To see if the outperformance is statistically significant, we calculate the MSFE-adjusted t-statistic by Eq. (10.6). The MSFE-adjusted t-statistic for the DVAR model is 2.633, which is significant at the level of 1%. The MSFE-adjusted t-statistic for the ARMA-GARCH-in-Mean model is 1.612, which is not significant at the level of 10%.

10.4 Economic evidence

A limitation of the $R^2_{out}$ measure is that it does not explicitly account for the risk borne by an investor over the out-of-sample period. To address this, following Campbell and Thompson (2008), we also calculate realized utility gains for a mean-variance investor on a real-time basis. More specifically, we first compute the average utility for a mean-variance investor with relative risk aversion parameter $\gamma$ who allocates his or her portfolio monthly between stocks and risk-free bills using forecasts of the equity premium based on the historical sample mean. This exercise requires the investor to forecast the variance of stock returns, and similar to Campbell and Thompson (2008), we assume that the investor estimates the variance using a ten-year rolling window. A mean-variance investor who forecasts the equity premium using the historical average will decide at the end of period t to allocate the following share of his or her portfolio to equities in period t+1:

where $r_{f,t+1}$ and $\hat{\sigma}^2_{t+1}$ are respectively the risk-free rate and the rolling-window estimate of the variance of stock returns.3 Over the out-of-sample period, the investor realizes an average utility level of

where $\hat{\mu}_0$ and $\hat{\sigma}^2_0$ are the sample mean and variance, respectively, over the out-of-sample period for the return on the benchmark portfolio formed using forecasts of the equity premium based on the historical sample mean.

We then compute the average utility for the same investor when he or she forecasts the equity premium using the DVAR or ARMA-GARCH-in-Mean model. He or she will choose an equity share of

and realizes an average utility level of

where $\hat{\mu}_p$ and $\hat{\sigma}^2_p$ are the sample mean and variance, respectively, over the out-of-sample period for the return on the portfolio formed using forecasts of the equity premium based on the DVAR or ARMA-GARCH-in-Mean model.

We measure the utility gain as the difference between Eq. (10.15) and Eq. (10.13), and we multiply this difference by 1200 for monthly observations to express it as an average annualized percentage return. The utility gain (or certainty equivalent return, CER) can be interpreted as the portfolio management fee that an investor would be willing to pay to have access to the additional information available in a predictive regression model relative to the information in the historical sample mean. We report results for $\gamma$ = 3; the results are qualitatively similar for other reasonable values of $\gamma$.
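The asset allocation exercise can be sketched as follows. This is an illustration under stated assumptions: the equity weight is the forecast equity premium divided by $\gamma$ times the variance forecast, clipped to the interval from 0 to 1.5 as in note 3; average utility is the sample mean minus $\gamma/2$ times the sample variance of portfolio returns; and returns are monthly decimals. The function names and numbers are hypothetical.

```python
def equity_weight(premium_forecast, variance_forecast, gamma=3.0, lo=0.0, hi=1.5):
    """Mean-variance weight, forecast / (gamma * variance), clipped to [lo, hi]."""
    raw = premium_forecast / (gamma * variance_forecast)
    return min(max(raw, lo), hi)


def average_utility(portfolio_returns, gamma=3.0):
    """Sample mean minus (gamma / 2) times the sample variance of portfolio returns."""
    n = len(portfolio_returns)
    mean = sum(portfolio_returns) / n
    var = sum((x - mean) ** 2 for x in portfolio_returns) / (n - 1)
    return mean - 0.5 * gamma * var


def annualized_cer_gain(model_returns, benchmark_returns, gamma=3.0):
    """Utility difference times 1200 (monthly decimal returns to annual percent)."""
    return 1200.0 * (average_utility(model_returns, gamma)
                     - average_utility(benchmark_returns, gamma))


w_inside = equity_weight(0.006, 0.002)    # unconstrained weight lies inside the bounds
w_floor = equity_weight(-0.010, 0.002)    # negative forecast: weight floored at 0
w_cap = equity_weight(0.020, 0.002)       # large forecast: weight capped at 1.5

# Made-up monthly portfolio returns for a model-based and a benchmark strategy.
gain = annualized_cer_gain([0.02, 0.01, 0.03], [0.01, 0.01, 0.01])
```

As a consistency check on the scaling, the realized utilities in Table 10.4 are in percent per month, so (0.618 - 0.193) × 12 ≈ 5.10, which agrees with the reported DVAR CER gain of 5.098 up to rounding.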

Table 10.4 presents the realized utilities of the historical mean model, the DVAR model, and the ARMA-GARCH-in-Mean model, together with the CER gains relative to the historical mean (HM). The realized utility of the DVAR model is 0.618, almost twice that (0.313) of the ARMA-GARCH-in-Mean model. Both earn a higher realized utility than the HM model. The annual CER gain of the DVAR (ARMA-GARCH-in-Mean) model is 5.098 (1.443). We also calculate the Sharpe ratio of each trading strategy, and the results are presented in Table 10.4. We find clear dominance of the DVAR model over the other models. For comparison, we also calculate the realized utility and CER gain of the simple buy-and-hold (BH) trading strategy. We find that a trading strategy based on the DVAR model even outperforms the BH strategy.

Figure 10.3 presents the dynamic weights allocated on equities by different models. Figure 10.4 presents the out-of-sample cumulative returns given by different trading strategies. Figure 10.4 shows clear dominance of the trading strategy based on the DVAR forecast over the other trading strategies.

Table 10.4 Realized utilities and CER gains

| | HM | ARMA-GARCH-in-Mean | DVAR | BH |
|---|---|---|---|---|
| Sharpe ratio | 0.051 | 0.095 | 0.172 | 0.088 |
| Realized utility (%) | 0.193 | 0.313 | 0.618 | 0.303 |
| CER gain | - | 1.443 | 5.098 | 1.322 |

Figure 10.3 Dynamic weights allocated on equities over time: 1995.01-2015.12

Figure 10.4 Dynamic cumulative portfolio returns formed by different trading strategies over time: 1995.01-2015.12

10.5 More details

Rapach et al. (2010) found that the predictability of the U.S. stock market depends strongly on the business cycle: stock returns are more predictable in recessions than in expansions.

To see how the predictive power of the DVAR model is related to the business cycle, we also divide the out-of-sample forecasts by the NBER-dated business cycle phases.4 There are 226 months of economic expansion and 26 months of economic recession.
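The split into expansion and recession months can be reproduced with a short sketch. It uses the recession periods listed in note 4 and the out-of-sample period 1995.01-2015.12. The constant and function names are hypothetical.

```python
# Recession periods from note 4, as ((start year, start month), (end year, end month)).
RECESSIONS = [((2001, 4), (2001, 11)), ((2008, 1), (2009, 6))]


def month_index(year, month):
    """Map a (year, month) pair to a running month count."""
    return year * 12 + (month - 1)


def is_recession(year, month):
    """True if the month falls inside one of the listed recession periods."""
    m = month_index(year, month)
    return any(month_index(*start) <= m <= month_index(*end)
               for start, end in RECESSIONS)


months = [(y, m) for y in range(1995, 2016) for m in range(1, 13)]
recession_months = [ym for ym in months if is_recession(*ym)]
expansion_months = [ym for ym in months if not is_recession(*ym)]
```

The counts agree with the text: 252 out-of-sample months, of which 26 fall in recessions and 226 in expansions. The same flags can be used to evaluate forecast errors or portfolio returns within each phase.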

Figure 10.5 presents the cumulative squared forecast error of the DVAR model over expansion in the left panel and over recession in the right panel. It is clear that the DVAR model has lower forecast error than the historical mean model in both economic expansion and recession. Figure 10.6 presents the difference between the cumulative squared forecast error for the historical mean benchmark forecasting model and the cumulative squared forecast error for the DVAR model over expansion in the left panel and over recession in the right panel. It seems that the outperformance of the DVAR model over the historical mean model is quite robust, regardless of the economic cycle, since the curve increases in a quite steady way.

Figure 10.5 Time series of cumulative squared forecast error over business cycle

Figure 10.6 Cumulative squared forecast error for the historical mean benchmark forecasting model minus the cumulative squared forecast error for the DVAR model over business cycle

We also calculate the MSFE-adjusted t-statistics by Eq. (10.6) over the business cycle. The MSFE-adjusted t-statistic over expansion is 2.293, which is significant at the level of 5%. The MSFE-adjusted t-statistic over recession is 1.781, which is significant at the level of 10%. This result confirms that the forecasts given by the DVAR model significantly dominate those reported by the historical mean model over both economic expansion and recession.

Figure 10.7 Dynamic cumulative portfolio return formed by different trading strategies over business cycle

To see if there is any difference in the economic value of the DVAR model relative to the historical mean model over the business cycle, we compute the CER gains. The CER gain is 2.654 over expansion and 25.638 over recession. This result indicates that the forecasts reported by the DVAR model are more valuable in economic recession than in economic expansion. Figure 10.7 presents the out-of-sample cumulative returns over expansion in the left panel and over recession in the right panel.

10.6 Summary

This chapter scrutinizes the performance of the candlestick in return forecasting using the DVAR model. The empirical study is performed on the monthly S&P500 index. The results show that the DVAR model outperforms both the historical mean model and the ARMA-GARCH-in-Mean model. Moreover, we find the outperformance is not only statistically significant but also economically valuable. Further evidence indicates that the dominance of the DVAR model is robust to the business cycle.

The results obtained in this chapter provide statistical evidence that the candlestick chart is valuable for predicting stock returns. We believe that the evidence presented in this chapter confirms, more or less, that candlestick chart forecasting is not “voodoo finance”.

Notes

  • 1 The list of studies is not meant to be exhaustive; see Goyal and Welch (2008) for more extensive surveys of the vast literature on return predictability.
  • 2 The market state of next month is determined by the last trading day’s 200-day moving average indicator of current month.
  • 3 Following Campbell and Thompson (2008), we constrain the portfolio weight on stocks to lie between 0% and 150% (inclusive) each month, so that $w_{0,t} = 0$ ($w_{0,t} = 1.5$) if $w_{0,t} < 0$ ($w_{0,t} > 1.5$) in Equation (10.12).
  • 4 According to the NBER-dated business cycle phases, the period 2001.04-2001.11 and the period 2008.01-2009.06 are used as economic recession periods.
 