# Methodologies for modelling risk factors as interactions

Advances in genetics have been accompanied by parallel advances in the methodologies applied by geneticists and biostatisticians to identify genes and decipher complex polygenic traits. The genetics-based model opens the door to many exciting methodologies, at least in principle. There is no doubt that applying them to a purely theoretical model such as the one presented above requires some adaptation.

In his presentation of factor-based macro-markets for hedging real estate and other assets, Shiller (1993) specifically mentions MIMIC (multiple indicators, multiple causes) models to “estimate factors as functions of other variables”. These models, which are applied in economics (e.g. in studies of factors contributing to economic performance), bring us back to genetics inasmuch as they originated in methodologies initially developed by genetics pioneer Sewall Wright (1889-1988).

Shiller notes that applying multivariate modelling techniques to identify factors is not devoid of problems:

> In practice, the idea of such modelling would have to be pursued with care, since the methods might produce factors that have no simple intuitive base, and there may be scepticism that an estimated factor structure will continue to hold up in the indefinite future. A less formal approach to identifying factors on which to base markets [...] may suffice.

It is interesting to note that among the shortcomings of factor-based hedging instruments for real estate assets, Shiller singles out the potential lack of dynamism of the underlying model. This chapter addresses this important point in the next section. Another point raised by Shiller (1993) with respect to factor-analytic models is the impact of interaction effects. Because of them, “the quality of the property would have different effect on price in different dates”. Indeed, identifying and qualifying interactions among variables (in pairs, threes, or more) should be central to any selected methodology. It is also part of the model’s dynamism.

Another issue facing researchers is that hypothesis testing in economics has to rely on uncontrolled experiments (Friedman, 1966). In an uncontrolled experimental environment, causal connections are harder to establish than in the controlled experiments biologists customarily carry out, in line with 19th-century physiologist Claude Bernard’s seminal *Study of Experimental Medicine* (Shipley, 2016). Mindful of “the basic confusion between descriptive accuracy and analytical relevance”, Friedman (1966) insists that in economics, “[a meaningful scientific hypothesis or theory] cannot be tested by comparing its assumptions directly with reality”, but by putting it in the context of “other hypotheses dealing with related phenomena”.

Methodologies presented in this book are derived from methods customarily applied in quantitative genetics (variance partitioning) and population genetics (path analysis). The trait under study (i.e. Total Return) is a continuous quantitative trait. The objective is to shed some light on variations in this trait over time (i.e. same building over time) and in space (comparative analysis of different buildings in different locations). The following section introduces Variance Partitioning. Path analysis is presented in Appendix 1.5.

## FUNDAMENTALS OF TOTAL RETURN VARIANCE DECOMPOSITION AT POPULATION LEVEL

To deal with risk factors which are essentially interaction effects, the genetics-based model of real estate risk relies on a parametric approach. It analyses property assets according to its own proforma framework (e.g. the genotype/phenotype dichotomy) and proposed structural models (e.g. 3 chromosomes x 15 genes x environment), which ought to be adapted depending on each property’s idiosyncrasies. In the process, the genetics-based model turns commercial real estate into a highly idiosyncratic, though conceptually homogeneous, abstraction interacting with the environment. The aim here is to qualify and explain how interactions among selected genes (i.e. risk factors in the model) drive variations in total return.

Variance decomposition is well suited to identifying the interaction effects at the core of the model’s dynamism. It has been applied in quantitative genetics to study variation in the genetics of a metric character at the population level (i.e. a large sample of buildings in a neighbourhood). The basic idea is to partition variation into components attributable to different causes.

In the genetics-based model, the only determinants of a property’s total return (P) are the genotype (G) and the environment (E):

P = G + E (1)

Environment encapsulates “all the non-genetic circumstances that influence the phenotypic value” (Falconer and Mackay, 1996). A property asset’s genotypic value G is itself made up of several components:

- The *genotypic value* of each gene involved in the phenotype: the genotypic values of two genes are supposed to be additive. A is the sum of all genotypic values attributable to separate genes;
- The *dominance deviation* D, in case one or more genes are dominant over other genes involved in the interaction;
- The *interaction deviation* I, if gene interactions affect the phenotype. I is the “deviation from additive combination of these genotypic values” involved in A above. If I = 0, then the genes in pairs, threes or higher numbers are purely additive.

Hence, G = A + D + I (2)

In terms of variance, the model yields:

V_{P} = V_{G} + V_{E} (3)

Or V_{P} = V_{A} + V_{D} + V_{I} + V_{E} (4)

The ratio V_{G}/V_{P} gives an estimate of the degree of genetic determination in the population, i.e. similarity of physical structure, lease characteristics, and location. V_{A}, the additive variance, is an indication of the degree of resemblance between properties in an experimental population.
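As a minimal numerical sketch of the decomposition of phenotypic variance into genotypic and environmental variance, the Python snippet below simulates a hypothetical population of properties whose genotypic values and environmental deviations are drawn independently. It then compares the phenotypic variance with the sum of the two components and reports the ratio of genotypic to phenotypic variance. The function name, the normal distributions and the standard deviations are illustrative assumptions, not part of the genetics-based model.

```python
import random
from statistics import pvariance


def simulate_population(n, sd_g, sd_e, seed=0):
    """Return hypothetical genotypic values G, environmental deviations E, and P = G + E."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    g = [rng.gauss(0.0, sd_g) for _ in range(n)]
    e = [rng.gauss(0.0, sd_e) for _ in range(n)]  # drawn independently of G
    p = [gi + ei for gi, ei in zip(g, e)]
    return g, e, p


# Assumed standard deviations: 2.0 for G and 1.0 for E (purely illustrative).
g, e, p = simulate_population(n=20000, sd_g=2.0, sd_e=1.0)
v_g, v_e, v_p = pvariance(g), pvariance(e), pvariance(p)
share_g = v_g / v_p  # degree of genetic determination, V_G / V_P

print(f"V_G={v_g:.3f}  V_E={v_e:.3f}  V_P={v_p:.3f}  V_G/V_P={share_g:.3f}")
```

With these assumed inputs the ratio should be close to 4 / (4 + 1) = 0.8. Any gap between the phenotypic variance and the sum of its two components equals twice the sample covariance between G and E, which differs from zero here only through sampling noise.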

V_{I}, the interaction variance, captures the variance of the interaction deviations. In the simple case of two genes involved in an interaction, there are three sorts of two-factor interactions:

- Additive (A) × Additive (A) if the two genes are additive
- Additive (A) × Dominant (D) if one gene is additive and the other dominant
- Dominant (D) × Dominant (D) if both genes are dominant

With V_{I} = V_{AA} + V_{AD} + V_{DD} (5)

Considering that in the genetics-based model, risk factors are interactions, one can expect V_{I} to account for the bulk of V_{G}, and ultimately V_{P}.
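The notion of an interaction deviation as a “deviation from additive combination” can be illustrated with a small two-gene table. The sketch below assumes that each gene takes a few discrete levels with equal weights; it fits the additive part from row and column means and treats the remainder as the interaction deviation. It does not separate the three two-factor components listed above, and the table values are hypothetical.

```python
from statistics import mean, pvariance


def additive_and_interaction(table):
    """Split a two-gene table of genotypic values into an additive fit and interaction deviations.

    table[i][j] is the value for level i of gene 1 and level j of gene 2 (equal weights assumed).
    """
    grand = mean(v for row in table for v in row)
    row_eff = [mean(row) - grand for row in table]        # effect of each level of gene 1
    col_eff = [mean(col) - grand for col in zip(*table)]  # effect of each level of gene 2
    additive = [[grand + r + c for c in col_eff] for r in row_eff]
    interaction = [
        [obs - fit for obs, fit in zip(obs_row, fit_row)]
        for obs_row, fit_row in zip(table, additive)
    ]
    return additive, interaction


def flat(table):
    """Flatten a table into a single list of cell values."""
    return [v for row in table for v in row]


purely_additive = [[1.0, 2.0], [3.0, 4.0]]   # hypothetical values, no interaction
with_interaction = [[1.0, 2.0], [3.0, 8.0]]  # same table with one cell shifted

_, i_add = additive_and_interaction(purely_additive)
a_int, i_int = additive_and_interaction(with_interaction)

v_total = pvariance(flat(with_interaction))
v_additive = pvariance(flat(a_int))
v_interaction = pvariance(flat(i_int))

print("interaction deviations, additive table:", flat(i_add))
print("total, additive, interaction variance:", v_total, v_additive, v_interaction)
```

In the first table every interaction deviation is zero, which is the purely additive case. In the second, the total variance of 7.25 splits into 6.25 for the additive fit and 1.0 for the interaction deviations.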

## CORRELATION AND INTERACTION BETWEEN A PROPERTY ASSET'S GENOTYPE AND THE ENVIRONMENT

It is highly likely that there is a correlation between genotypic value and environmental deviation in the model. That is, the better buildings in terms of physical structure tend to be built in the better environment. Furthermore, the better buildings in terms of total return for investors will tend to attract more development of buildings with similar characteristics in similar environments, thereby improving the overall quality of the correlation between genotype and environment.

As a result, V_{P} = V_{G} + V_{E} + 2COV_{GE} (6)

Falconer and Mackay (1996) explain that when this occurs for human phenotypes, as the covariance is unknown in practice, “an individual’s environment can be thought of as part of its genotype”. This is in effect the case in the genetics-based model, where location, in the broad sense of the term, is part of a property asset’s genotype.
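The role of the covariance term can be checked numerically. In the sketch below, hypothetical environmental deviations are constructed to be positively correlated with genotypic values (the 0.5 loading and the normal noise are assumptions made for illustration), and the phenotypic variance is compared with the sum of the genotypic variance, the environmental variance and twice their covariance.

```python
import random
from statistics import mean, pvariance


def pcovariance(x, y):
    """Population covariance of two equally long samples."""
    mx, my = mean(x), mean(y)
    return mean((a - mx) * (b - my) for a, b in zip(x, y))


rng = random.Random(1)  # fixed seed for reproducibility
g = [rng.gauss(0.0, 1.0) for _ in range(5000)]
# Assumption: better genotypes tend to sit in better environments (loading of 0.5).
e = [0.5 * gi + rng.gauss(0.0, 1.0) for gi in g]
p = [gi + ei for gi, ei in zip(g, e)]

v_g, v_e, v_p = pvariance(g), pvariance(e), pvariance(p)
cov_ge = pcovariance(g, e)
naive_sum = v_g + v_e               # ignores the correlation
full_sum = v_g + v_e + 2 * cov_ge   # includes the covariance term

print(f"V_P={v_p:.3f}  V_G+V_E={naive_sum:.3f}  with 2COV_GE={full_sum:.3f}")
```

When genotype and environment are positively correlated, ignoring the covariance understates the phenotypic variance, whereas the sum including the covariance term matches it up to floating-point error.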

Equation 1 assumes that a given difference in environment has the same effect irrespective of the genotype on which it acts. This is obviously not realistic. In practice, specific differences in environmental factors might have different effects on various phenotypes depending on the environmental sensitivity of a genotype (also known as “reaction norm”). For instance, an office building located in the City of London might show a larger total return than another seemingly similar office building in the City, but a smaller return than another similar building in Paris’ Golden Triangle. Thus, Equation 1 should be rewritten as follows:

P = G + E + I_{GE} (7)

where I_{GE} captures interaction effects between the genotype and the environment,

and V_{P} = V_{G} + V_{E} + 2COV_{GE} + V_{GE} (8)

To qualify whether specific environments are more or less favourable to the expression of the total return trait in a property, one can compute the *environmental sensitivity* by regressing the genotype value (G) on the environmental value (equal to the mean of all genotypes in that environment).
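One way to operationalise this regression is sketched below. The building types, the three environments and all figures are hypothetical. The environmental value of each environment is taken as the mean over all genotypes observed there, and the sensitivity of each genotype is the ordinary least-squares slope of its values on those environmental values.

```python
from statistics import mean


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def environmental_sensitivity(values_by_genotype):
    """Return the regression slope of each genotype on the environmental value.

    values_by_genotype maps a genotype label to its values, listed in the same
    environment order for every genotype.
    """
    genotypes = list(values_by_genotype)
    n_env = len(values_by_genotype[genotypes[0]])
    # Environmental value: mean of all genotypes observed in that environment.
    env_value = [mean(values_by_genotype[g][k] for g in genotypes) for k in range(n_env)]
    return {g: ols_slope(env_value, values_by_genotype[g]) for g in genotypes}


# Hypothetical total returns (%) of two building types in three environments.
returns = {
    "prime_office": [4.0, 6.0, 10.0],
    "secondary_office": [4.0, 5.0, 6.0],
}
sensitivity = environmental_sensitivity(returns)
for label, slope in sensitivity.items():
    print(f"{label}: sensitivity = {slope:.2f}")
```

By construction the slopes average to one across genotypes, so a slope above one flags a building type that responds more than average to a change in environment, and a slope below one flags a less responsive type.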

## APPLICATION OF VARIANCE ANALYSIS AT THE PROPERTY LEVEL

Variance decomposition can also be applied at the property level by breaking down variance in total returns into a component measured within each property *(within-property component)* and another component measuring differences between properties *(between-property component).* The within-property component is due to changes in the environment (all other things being equal in terms of genotype) whereas the between-property component measures the permanent differences between properties in a population.

Falconer and Mackay (1996) note that “by this analysis, the variance due to temporary environmental circumstances is separated from the rest, and can be measured”. The within-property variance results entirely from the environment, owing to temporary or localised circumstances. It is called the *Special Environment Variance*, V_{ES}.

In parallel, V_{EG}, the *General Environment Variance*, captures the variance of the permanent between-property component, with:

V_{E} = V_{ES} + V_{EG} (9)

The ratio in Equation 10 is known as the repeatability of the character, r, with:

r = (V_{G} + V_{EG}) / V_{P} (10)

The repeatability measures the proportion of the variance in total return due to the permanent, non-localised differences between properties, both genetic and environmental. This type of analysis could be applied at the MSA level. In the genetics-based model, V_{ES} refers to the space-time varying environment at the individual property level, i.e. at the micro-scale.
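A minimal sketch of how repeatability might be estimated from repeated observations is given below. It assumes a balanced panel (the same number of periodic total returns for every property) and uses the standard one-way analysis-of-variance estimators: the within-property mean square estimates the special environment variance, and the between-property variance component estimates the permanent part, genotypic plus general environment. The property labels and returns are hypothetical.

```python
from statistics import mean


def repeatability(returns_by_property):
    """Estimate repeatability from a balanced panel of repeated total returns.

    Returns (r, between-property variance component, within-property variance).
    """
    groups = list(returns_by_property.values())
    n = len(groups)     # number of properties
    k = len(groups[0])  # observations per property (balanced design assumed)
    grand = mean(v for grp in groups for v in grp)
    ms_between = k * sum((mean(grp) - grand) ** 2 for grp in groups) / (n - 1)
    ms_within = sum((v - mean(grp)) ** 2 for grp in groups for v in grp) / (n * (k - 1))
    # Between-property component, floored at zero if sampling noise makes it negative.
    v_between = max((ms_between - ms_within) / k, 0.0)
    return v_between / (v_between + ms_within), v_between, ms_within


# Hypothetical annual total returns (%) for three properties over four years.
panel = {
    "property_a": [5.0, 6.0, 5.0, 6.0],
    "property_b": [8.0, 9.0, 8.0, 9.0],
    "property_c": [2.0, 3.0, 2.0, 3.0],
}
r, v_between, v_within = repeatability(panel)
print(f"r={r:.3f}  between={v_between:.3f}  within={v_within:.3f}")
```

In this illustrative panel the properties differ far more from one another than each varies over time, so the estimated repeatability is close to one.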

To separate V_{EG}, the general environment variance, from V_{G}, the genotypic variance, Falconer and Mackay (1996) suggest calculating repeatability in a genetically uniform group (i.e. similar buildings in the same neighbourhood with similar physical and lease structures). The between-property variance of this group of buildings is equal to the general environment component V_{EG}, since V_{G} of a genetically uniform sample of buildings is close or equal to zero. Hence,

If V_{G} = 0, then V_{P} = V_{E} = V_{EG} + V_{ES} (11)

and the repeatability ratio r = V_{EG} / (V_{EG} + V_{ES}) (12)
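The reasoning behind this separation can be written out explicitly. The derivation below is a sketch which assumes that the general environment variance measured in the genetically uniform group is representative of the whole population. The subscript u, and the symbols r_{u} and V_{P,u} for the repeatability and phenotypic variance of the uniform group, are notation introduced here for illustration.

```latex
\begin{align*}
\text{whole population:}\quad & r\,V_{P} = V_{G} + V_{EG} \\
\text{uniform group } (V_{G} = 0):\quad & r_{u}\,V_{P,u} = V_{EG} \\
\text{hence:}\quad & V_{G} \approx r\,V_{P} - r_{u}\,V_{P,u}
\end{align*}
```

In words, the genotypic variance is what remains of the permanent between-property variance in the whole population once the between-property variance of the uniform group has been subtracted.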

Partitioning the environment variance into general and specific environments makes it possible to focus on property-level interactions with the environment. V_{ES} expresses the micro-scale dimension of location which involves the more granular locational variables (including the local real estate market’s impact on individual properties), whilst V_{EG} expresses the broader spatial dimension of the environment linked to less granular levels of location-related genes in the model.

Variance decomposition of a property negatively affected by time (obsolescence) can be expected to express many interactions with the grade gene on the physical structure chromosome. This gene might become dominant to the point of being lethal to the building. In particular, whenever environmental variance shifts from special (ES) to permanent (EG), this can spell disaster for unfit properties such as an ageing building.

Variance decomposition can help design effective factor-based hedges for commercial real estate assets, by deciphering the role played by each category of variables in explaining total return variance at the property level. Properties where V_{E} dominates are easier to hedge than those where V_{G} dominates. The latter type might have structural issues whose origins can vary widely, from physical structure to location. Moreover, if V_{ES} dominates, one can expect more local real estate market related factors in optimal hedges. Conversely, if V_{EG} dominates V_{E}, one can expect more macro-variables in optimal hedges.

To design effective hedging instruments, researchers can use variance partitioning as a guiding tool for factor selection in factor-based real estate derivatives. Although not mainstream, variance decomposition is not new in commercial real estate studies: Wheaton (2015), for example, partitions the volatility in vacancy into demand and supply contributions for four property types in US MSAs. The methodology, a staple of quantitative genetics, is well suited to real estate derivatives insofar as it focuses on the very phenomenon the genetics-based model aims to capture, i.e. variation in total return as a result of interactions.

Other methodologies derived from genetics and biological sciences such as path analysis and structural equation modelling based on maximum likelihood techniques (Shipley, 2016) are also worth considering for implementing the genetics-based model. Appendix 1.5 presents an application of path analysis.