 # What is Arbitrage Pricing Theory?

The Arbitrage Pricing Theory (APT) of Stephen Ross (1976) represents the returns on individual assets as a linear combination of multiple random factors. These random factors can be fundamental or statistical. For there to be no arbitrage opportunities there must be restrictions on the investment processes.

Example

Suppose that there are five dominant causes of randomness across investments. These five factors might be the market as a whole, inflation, oil prices, etc. If you are asked to invest in six different, well-diversified portfolios then either one of these portfolios will have approximately the same risk and return as a suitable combination of the other five, or there will be an arbitrage opportunity.

Modern Portfolio Theory represents each asset by its own random return and then links the returns on different assets via a correlation matrix. In the Capital Asset Pricing Model returns on individual assets are related to returns on the market as a whole together with an uncorrelated stock-specific random component. In Arbitrage Pricing Theory returns on investments are represented by a linear combination of multiple random factors, with associated factor weighting. Portfolios of assets can also be decomposed in this way. Provided the portfolio contains a sufficiently large number of assets, then the stock-specific component can be ignored. Being able to ignore the stock-specific risk is the key to the 'A' in 'APT.'

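We write the random return on the $i$th asset as

$$R_i = \alpha_i + \sum_{j=1}^{n} \beta_{ji} \bar{R}_j + \epsilon_i,$$

where the $\bar{R}_j$ are the factors, the $\alpha$'s and $\beta$'s are constants and $\epsilon_i$ is the stock-specific risk. A portfolio of these assets, with weights $w_i$, has return

$$\sum_{i=1}^{N} w_i R_i = \sum_{i=1}^{N} w_i \alpha_i + \sum_{j=1}^{n} \left( \sum_{i=1}^{N} w_i \beta_{ji} \right) \bar{R}_j + \cdots,$$

where the $\cdots$ (the weighted stock-specific terms $\sum_{i=1}^{N} w_i \epsilon_i$) can be ignored if the portfolio is well diversified.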
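To see why the stock-specific component fades with diversification, here is a minimal sketch. It assumes equal portfolio weights, stock-specific risks that are uncorrelated across assets, and a common stock-specific volatility of 30%; all numbers and the helper name `portfolio_decomposition` are illustrative and not from the text.

```python
import numpy as np

def portfolio_decomposition(weights, alphas, betas, resid_vols):
    """Return (constant term, factor exposures, stock-specific volatility).

    weights    : shape (N,)   portfolio weights
    alphas     : shape (N,)   constant term of each asset
    betas      : shape (n, N) factor weightings, one row per factor
    resid_vols : shape (N,)   stock-specific volatilities, assumed uncorrelated
    """
    weights = np.asarray(weights, dtype=float)
    port_alpha = weights @ np.asarray(alphas, dtype=float)
    port_betas = np.asarray(betas, dtype=float) @ weights
    # Uncorrelated residuals: the variances add, weighted by squared weights.
    port_resid_vol = np.sqrt(np.sum((weights * np.asarray(resid_vols)) ** 2))
    return port_alpha, port_betas, port_resid_vol

rng = np.random.default_rng(0)
n_factors = 5
resid_vol_by_size = {}
for n_assets in (10, 100, 1000):
    w = np.full(n_assets, 1.0 / n_assets)               # equal weights (assumption)
    alphas = rng.normal(0.05, 0.01, n_assets)           # made-up constants
    betas = rng.normal(1.0, 0.3, (n_factors, n_assets)) # made-up factor weightings
    vols = np.full(n_assets, 0.30)                      # made-up stock-specific vols
    _, exposures, resid = portfolio_decomposition(w, alphas, betas, vols)
    resid_vol_by_size[n_assets] = resid
    print(n_assets, np.round(exposures, 2), round(resid, 4))
```

Under these assumptions the stock-specific volatility of the portfolio falls like $0.30/\sqrt{N}$, while the factor exposures stay of order one. That is the sense in which the stock-specific part becomes negligible relative to the factor part.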

Suppose we think that five factors are sufficient to represent the economy. We can therefore decompose any portfolio into a linear combination of these five factors, plus some supposedly negligible stock-specific risks. If we are shown six diversified portfolios we can decompose each into the five random factors. Since there are more portfolios than factors we can find a relationship between (some of) these portfolios, effectively relating their values, otherwise there would be an arbitrage. Note that the arbitrage argument is an approximate one, relating diversified portfolios, on the assumption that the stock-specific risks are negligible compared with the factor risks.
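The counting argument can be sketched numerically. The example below assumes made-up factor weightings for six diversified portfolios, a made-up risk-free rate, and that any weight not placed in the five portfolios is held in cash; none of these numbers come from the text. It finds the combination of the first five portfolios that matches the factor exposure of the sixth.

```python
import numpy as np

rng = np.random.default_rng(1)
n_factors = 5

# Hypothetical factor weightings: column k holds the five loadings of
# diversified portfolio k (six portfolios in total).
loadings = rng.normal(1.0, 0.5, size=(n_factors, 6))
alphas = rng.normal(0.04, 0.01, size=6)   # hypothetical constant terms
risk_free = 0.02                          # hypothetical cash return

# Weights on portfolios 1..5 whose combined factor exposure equals that of
# portfolio 6: five equations in five unknowns.
c = np.linalg.solve(loadings[:, :5], loadings[:, 5])
replicated_exposure = loadings[:, :5] @ c

# Whatever is not invested in the five portfolios is held in (or borrowed as) cash.
cash_weight = 1.0 - c.sum()
implied_alpha_6 = c @ alphas[:5] + cash_weight * risk_free

print(np.round(c, 3))
print(np.allclose(replicated_exposure, loadings[:, 5]))
print(round(implied_alpha_6, 4), round(alphas[5], 4))
```

If the constant term of the sixth portfolio differs from `implied_alpha_6`, then going long one side and short the other leaves a position with no factor risk and a non-zero return. That is the approximate arbitrage, approximate because the stock-specific risks have been neglected.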

In practice we can choose the factors to be macroeconomic or statistical. Here are some possible macroeconomic variables.

• an index level

• GDP growth

• an interest rate (or two)

• a default spread on corporate bonds

• an exchange rate.

Statistical variables come from an analysis of the covariance matrix of asset returns. From this one extracts the factors by some suitable decomposition.

The main difference between CAPM and APT is that CAPM is based on equilibrium arguments to get to the concept of the Market Portfolio, whereas APT is based on a simple approximate arbitrage argument. Although APT talks about arbitrage, this must be contrasted with the arbitrage arguments we see in spot versus forward and in option pricing. These are genuine exact arbitrages (albeit the latter is model dependent). In APT the arbitrage is only approximate.
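As an illustration of extracting statistical factors from a covariance matrix, the sketch below uses an eigenvalue decomposition (principal components). This is one possible "suitable decomposition", chosen here as an assumption; the text does not say which one is meant. The returns are synthetic, generated from two hidden factors plus stock-specific noise.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_assets, n_factors = 2000, 8, 2

# Synthetic returns: two hidden factors plus small stock-specific noise.
true_loadings = rng.normal(0.0, 1.0, size=(n_factors, n_assets))
hidden_factors = rng.normal(0.0, 0.02, size=(n_obs, n_factors))
noise = rng.normal(0.0, 0.005, size=(n_obs, n_assets))
returns = hidden_factors @ true_loadings + noise

# Sample covariance matrix of asset returns and its eigen-decomposition.
cov = np.cov(returns, rowvar=False)        # shape (n_assets, n_assets)
eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]          # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Share of total variance carried by each component, and the factor time series.
explained = eigvals / eigvals.sum()
stat_factors = (returns - returns.mean(axis=0)) @ eigvecs[:, :n_factors]

print(np.round(explained, 3))
```

With data built this way, most of the variance should sit in the first two components, and the columns of `stat_factors` play the role of the statistical factors. On real returns the number of factors to keep is a judgment call.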

Ross, S 1976 The arbitrage theory of Capital Asset Pricing. Journal of Economic Theory 13 341-360

# What is Maximum Likelihood Estimation?

Maximum Likelihood Estimation (MLE) is a statistical technique for estimating parameters in a probability distribution. We choose parameters that maximize the a priori probability of the final outcome actually happening.

Example

You have three hats containing normally distributed random numbers. One hat's numbers have a mean of zero and a standard deviation of 0.1. This is hat A. Another hat's numbers have a mean of zero and a standard deviation of 1. This is hat B. The final hat's numbers have a mean of zero and a standard deviation of 10. This is hat C. You don't know which hat is which.

You pick a number out of one hat. It is −2.6. Which hat do you think it came from? MLE can help you answer this question.

A large part of statistical modelling concerns finding model parameters. One popular way of doing this is Maximum Likelihood Estimation.

The method is easily explained by a very simple example. You are attending a maths conference. You arrive by train at the city hosting the event. You take a taxi from the train station to the conference venue. The taxi number is 20,922. How many taxis are there in the city?

This is a parameter estimation problem. Getting into a specific taxi is a probabilistic event. Estimating the number of taxis in the city from that event is a question of assumptions and statistical methodology.

For this problem the obvious assumptions to make are:

1. Taxi numbers are strictly positive integers

2. Numbering starts at 1

3. No number is repeated

4. No number is skipped.

We will look at the probability of getting into taxi number 20,922 when there are $N$ taxis in the city. This couldn't be simpler: the probability of getting into any specific taxi is

$$\frac{1}{N}.$$

Which $N$ maximizes the probability of getting into taxi number 20,922? Since $N$ cannot be smaller than the number we observed, and $1/N$ decreases as $N$ grows, the answer is

N = 20,922.

This example explains the concept of MLE: Choose parameters that maximize the probability of the outcome actually happening.
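The taxi argument can be checked with a few lines of code. This is a minimal sketch, assuming every taxi is equally likely to be picked and searching an arbitrary range of candidate values of N up to 50,000; the function name `taxi_likelihood` is made up for illustration.

```python
def taxi_likelihood(n_taxis, observed=20922):
    """Probability of getting into taxi `observed` in a city with n_taxis taxis."""
    if n_taxis < observed:
        return 0.0          # the observed number could not exist in this city
    return 1.0 / n_taxis    # every taxi equally likely (assumption)

# The upper search bound is arbitrary; any bound above 20,922 gives the same answer.
candidates = range(1, 50001)
n_hat = max(candidates, key=taxi_likelihood)
print(n_hat)
```

The likelihood is zero below the observed number and strictly decreasing above it, so the maximum sits exactly at the observed taxi number.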

Another example, more closely related to problems in quantitative finance, is the hat example above. You have three hats containing normally distributed random numbers. One hat's numbers have a mean of zero and a standard deviation of 0.1. This is hat A. Another hat's numbers have a mean of zero and a standard deviation of 1. This is hat B. The final hat's numbers have a mean of zero and a standard deviation of 10. This is hat C.

You pick a number out of one hat. It is −2.6. Which hat do you think it came from?

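The 'probability' of picking the number −2.6 from hat A (having a mean of zero and a standard deviation of 0.1) is

$$\frac{1}{\sqrt{2\pi}\,0.1} \exp\left(-\frac{2.6^2}{2 \times 0.1^2}\right) = 6 \times 10^{-147}.$$

Very, very unlikely!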

(N.B. The word 'probability' is in inverted commas to emphasize the fact that this is the value of the probability density function, not the actual probability. The probability of picking exactly −2.6 is, of course, zero.)

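The 'probability' of picking the number −2.6 from hat B (having a mean of zero and a standard deviation of 1) is

$$\frac{1}{\sqrt{2\pi}\,1} \exp\left(-\frac{2.6^2}{2 \times 1^2}\right) = 0.014,$$

and from hat C (having a mean of zero and a standard deviation of 10)

$$\frac{1}{\sqrt{2\pi}\,10} \exp\left(-\frac{2.6^2}{2 \times 10^2}\right) = 0.039.$$

We would conclude that hat C is the most likely, since it has the highest probability for picking the number −2.6.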

We now pick a second number from the same hat. It is 0.37. This looks more likely to have come from hat B. Table 2.2 shows the probabilities.

The second column represents the probability of drawing the number −2.6 from each of the hats; the third column represents the probability of drawing 0.37 from each of the hats; and the final column is the joint probability, that is, the probability of drawing both numbers from each of the hats.

Table 2.2: Probabilities and hats.

| Hat | −2.6 | 0.37 | Joint |
|---|---|---|---|
| A | $6 \times 10^{-147}$ | 0.004 | $2 \times 10^{-149}$ |
| B | 0.014 | 0.372 | 0.005 |
| C | 0.039 | 0.04 | 0.002 |

Using the information about both draws, we can see that the most likely hat is now B.
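The hat comparison can be reproduced with the Normal density function. The sketch below uses only the numbers given in the example (standard deviations 0.1, 1 and 10, draws of −2.6 and 0.37); the helper name `normal_pdf` is an illustrative choice.

```python
import math

def normal_pdf(x, mean, sd):
    """Value of the Normal probability density function at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sd ** 2)) / (math.sqrt(2.0 * math.pi) * sd)

hats = {"A": 0.1, "B": 1.0, "C": 10.0}   # standard deviation of each hat
draws = [-2.6, 0.37]

joint = {}
for name, sd in hats.items():
    densities = [normal_pdf(x, 0.0, sd) for x in draws]
    joint[name] = math.prod(densities)    # independent draws: densities multiply
    print(name, densities, joint[name])

# Most likely hat after the first draw only, and after both draws.
best_after_first = max(hats, key=lambda h: normal_pdf(draws[0], 0.0, hats[h]))
best_after_both = max(joint, key=joint.get)
print(best_after_first, best_after_both)
```

The printed densities carry more digits than the rounded table, so small differences in the last figure are to be expected. The ranking is the point: hat C wins on the first draw alone, hat B once both draws are used.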

Now let's make this into precisely a quant finance problem.

## Find the volatility

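You have one hat containing normally distributed random numbers, with a mean of zero and a standard deviation of $\sigma$ which is unknown. You draw $N$ numbers $\phi_i$ from this hat. Estimate $\sigma$.

Q. What is the 'probability' of drawing $\phi_i$ from a Normal distribution with mean zero and standard deviation $\sigma$?

A. It is

$$\frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{\phi_i^2}{2\sigma^2}}.$$

Q. What is the 'probability' of drawing all of the numbers $\phi_1, \phi_2, \ldots, \phi_N$ from independent Normal distributions with mean zero and standard deviation $\sigma$?

A. It is

$$\prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{\phi_i^2}{2\sigma^2}}.$$

Now choose the $\sigma$ that maximizes this quantity. This is easy. First take logarithms of this expression, and then differentiate with respect to $\sigma$ and set the result equal to zero:

$$\frac{d}{d\sigma}\left(-N \ln \sigma - \frac{1}{2\sigma^2} \sum_{i=1}^{N} \phi_i^2\right) = 0.$$

(The multiplicative factor $(2\pi)^{-N/2}$, which does not depend on $\sigma$, has been ignored here.) That is:

$$-\frac{N}{\sigma} + \frac{1}{\sigma^3} \sum_{i=1}^{N} \phi_i^2 = 0.$$

Therefore, our best guess for $\sigma$ is given by

$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} \phi_i^2.$$

You should recognize this as a measure of the variance.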
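As a numerical check on this result, the sketch below draws synthetic zero-mean Normal numbers with a made-up true standard deviation of 0.2. It then compares the closed-form estimate (the square root of the average squared draw) with a brute-force search over the log-likelihood. The sample size, seed and search grid are arbitrary choices for illustration.

```python
import numpy as np

def sigma_mle(samples):
    """Closed-form MLE of the standard deviation when the mean is known to be zero."""
    samples = np.asarray(samples, dtype=float)
    return np.sqrt(np.mean(samples ** 2))

def log_likelihood(sigma, samples):
    """Log of the joint Normal density of the samples, mean zero, std dev sigma."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    return (-n * np.log(sigma)
            - np.sum(samples ** 2) / (2.0 * sigma ** 2)
            - 0.5 * n * np.log(2.0 * np.pi))

rng = np.random.default_rng(3)
true_sigma = 0.2                                   # made-up value
phi = rng.normal(0.0, true_sigma, size=10_000)     # synthetic draws

sigma_hat = sigma_mle(phi)

# Brute-force check: evaluate the log-likelihood on a grid with spacing 0.0005.
grid = np.linspace(0.05, 0.5, 901)
grid_best = grid[np.argmax([log_likelihood(s, phi) for s in grid])]

print(sigma_hat, grid_best)
```

Because the mean is known to be zero, the sum of squares is divided by the number of draws rather than by one fewer, as it would be in the usual sample variance.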

## Quants' salaries

Figure 2.4 shows the results of a 2004 survey on wilmott.com concerning the salaries of quants using the Forum (or rather, those answering the question!). This distribution looks vaguely lognormal, with distribution

$$\frac{1}{\sqrt{2\pi}\,\sigma E} \exp\left(-\frac{\left(\ln E - \ln E_0\right)^2}{2\sigma^2}\right),$$

where $E$ is annual earnings, $\sigma$ is the standard deviation and $E_0$ the mean (both on the logarithmic scale, so that $\ln E_0$ is the mean of $\ln E$). We can use MLE to find $\sigma$ and $E_0$.

Figure 2.4: Distribution of quants' salaries (survey question: "If you are a professional 'quant,' how much do you earn?").

It turns out that the mean $E_0 = \$133,284$, with $\sigma = 0.833$.
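For a lognormal distribution with scale $E_0$ and log-standard-deviation $\sigma$, maximizing the likelihood gives $\ln E_0$ as the average of $\ln E$, and $\sigma^2$ as the average squared deviation of $\ln E$ from $\ln E_0$. The sketch below applies this to synthetic earnings generated from the quoted parameter values. It does not use the survey data, which are not reproduced here and were presumably collected in salary bands, so it illustrates the technique rather than the fit actually performed.

```python
import numpy as np

def lognormal_mle(earnings):
    """MLE of (E0, sigma) for lognormally distributed earnings."""
    logs = np.log(np.asarray(earnings, dtype=float))
    log_e0 = logs.mean()                              # MLE of ln(E0)
    sigma = np.sqrt(np.mean((logs - log_e0) ** 2))    # MLE of sigma
    return np.exp(log_e0), sigma

# Synthetic sample (not the survey data), built from the quoted parameters.
rng = np.random.default_rng(4)
synthetic = 133_284 * np.exp(0.833 * rng.standard_normal(50_000))

e0_hat, sigma_hat = lognormal_mle(synthetic)
print(round(e0_hat), round(sigma_hat, 3))
```

With a sample this large the estimates should land close to the values used to generate it. With binned survey responses, one would instead maximize the probability of the observed counts in each salary band.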