# Appendix 8: Points in a heterogeneous space

## Appendix 8.1: D-function-based test of spatial interactions

The (spatstat) function envelope that we introduced in Appendix 7 does not allow direct computation of the random labelling simulated confidence envelopes for Diggle and Chetwynd’s D-function. Therefore, we need to write our own R code. Let us start by loading demopat, a useful example dataset contained in the spatstat library:

> library(spatstat)

> data(demopat)

The demopat dataset is an artificial multitype point pattern, in the form of a ppp object characterized by two different types of points, labelled A and B. Let us suppose that we are interested in performing a D-function-based test of spatial interactions in which the points of type A are treated as cases and the points of type В are treated as controls. The empirical ЛГ-function for the cases and controls, К cases (d) and К controls (<0> can obtained using the function Kest, which at least requires a ppp object as an input, and hence typing

> KdemopatCases <- Kest(subset(demopat, marks=="A"),

+ correction"isotropic")

> KdemopatControls <- Kest(subset(demopat, marks=="B"),

+ correction"isotropic") where subset(demopat,marks==“A”) and subset(demopat,marks==“B”) provide, respectively, the ppp objects for the point pattern of type A (the cases) and the point pattern of type В (the controls). The option correction allows us to specify the method to correct for edge effects. The choice isotropic indicates, in particular, to use the Ripley (1976) method.

The output of Kest contains the vector iso of the values of the empirical ^-function and the vector r of the corresponding distance values d. Therefore, we can obtain D(d) = Kcases (d) - К controls (d) and the corresponding vector of d with:

> Demp <- KdemopatCases\$iso-KdemopatControls\$iso

> d <- KdemopatCases\$r

In order to compute the confidence envelopes under the null hypothesis of random labelling, we need to generate nsim randomly labelled versions of demopat in which the cases and controls labels are randomly permuted amongst the observed locations. If we run rlabel(demopat) we obtain a point pattern where the labels A and В are randomly permuted among the points of demopat. Therefore, for example, to obtain a list of 29 randomly labelled patterns of A and В points, we can type:

> nsim <-29

> RLsim

For each of the 29 simulated patterns we need to compute the D-function. To avoid the use of loops, we can again rely on the function lapply, which applies a given function to all the elements of a list. It is then convenient to first build a function to compute the D-function for a two types point pattern:

> DenvFunct <- function(pattern, idcases, idcontrols) {

+ Kcases <- Kest(subset(pattern, marks==idcases))

+ Kcontrois <- Kest(subset(pattern, marks==idcontrols))

+ D <- Kcases - Kcontrois + D\$iso + }

and next apply it to the list of the nsim simulated randomly labelled patterns with the aim of obtaining a matrix of the resulting D-function, that is:

> Denv <- lapply(RLsim, FUN=DenvFunct, idcases="A",

+ idcontrols="B")

> Denv <- matrix(unlist(Denv), ncol=nsim)

Each column of the matrix Denv contains the D-function values for each randomly labelled pattern. As a consequence, the maximum and minimum values across the rows of Denv provide, respectively, the values of the upper and lower confidence envelopes. These can be obtained with

> EnvUpper <- apply(Denv, 1, max)

> EnvLower <- apply(Denv, 1, min)

Finally, the graphical comparison between the empirical D-function and the corresponding upper and lower confidence envelopes under random labelling can then be made by typing:

> plot(d, Demp, type="n",

ylim=c(min(EnvLower),max(EnvUpper)),

+ xlab="d", ylab=expression(hat(D)))

> bandx <- c(d, rev(d))

> bandy <- c(EnvLower, rev(EnvUpper))

> polygon(bandx, bandy, border="grey", col="grey")

> lines(d, Demp)

Appendix 8.2: К,• n /„, -fu nction-bascd test of spatial interactions

To perform a test of spatial interactions based on the /]n|U)m-function we can use the R function envelope from the spatstat package introduced in Appendix 7. By way of illustration, let us see how to conduct the test for the artificial point pattern dataset simdat. If we opt to estimate the first-order intensity non-paramet- rically using Gaussian kernel smoothing, we can use the function density.ppp, which simply requires, among other possible options, to specify the ppp object of interest and, sigma, that is the bandwidth for the Gaussian kernel:

> library(spatstat)

> lambda <- density.ppp(simdat, sigma=l.5)

> plot(lambda)

Then, we have to create the object KinhomBands which, among other elements, contains the vectors of the empirical -Kjnliom-function computed on simdat and the upper and lower confidence envelopes for 39 realizations of an inhomogeneous Poisson process conditional on the same number of points of simdat and first-order intensity values provided by lambda:

> KinhomBands <- envelope(simdat, fun=Kinhom, nsim=39,

+ simulate=expression(rpoint(n=simdat\$n, f=lambda)), sigma=l.5)

With the option fun=Kinhom we have specified that we want to compute the simulated confidence envelopes using the fnhom-function as a summary function; the option nsim indicates, as usual, the number of simulations to perform; and simulate allows us to choose the kind of simulation we need. The function rpoint generates a random point pattern containing n independent, identically distributed random points according to a given first-order intensity. With the syntax expression(rpoint(n=simdat\$n, f=lambda)) we can simulate inhomogeneous Poisson patterns with a fixed number of points equal to the size of simdat and first-order intensity values provided by lambda.

Finally, the graphical comparison between the empirical Ljni,om-fimction with the corresponding upper and lower confidence envelopes can be made through the function plot.envelope, that is:

> plot(KinhomBands, fmla=sqrt(./pi) ~ r)

Alternatively, in order to perform the .Kjnhom-function-based test with the first-order intensity function estimated by a parametric regression model, the function ppm can be used. For example, to estimate A(.v) with a model that assumes that the intensity depends on a quadratic trend in the spatial coordinates, we may run:

> fit <- ppm(simdat, ~ polynom(x,y,2) , PoissonO)

> lambda <- predict(fit)

## Appendix 8.3: Duranton–Overman K-density and Marcon–Puech M-function

To implement the Duranton-Overman JC-density and Marcon-Puech M-function measures of spatial concentration of economic activities in R, the library (dbmss) (Marcon et al., 2015) provides, respectively, the functions KdEnvelope and MEnvelope. Let us see how to use them while assuming that we are interested in assessing the level of relative spatial concentration of points of type A in the spatstat example dataset demopat. Both the functions require that the point pattern under study is in the form of a on R-object of class “wmppp”. We can create a wmppp object using the function wmppp that takes as inputs a dataframe - containing the point coordinates and the potential associated typologies and weights - and the polygon defining the study area. Therefore, to convert demopat into a wmpp object we can type:

> demopat2 <- wmppp(data.frame(X=demopat\$x, Y=demopat\$y,

+ PointType=demopat\$marks), window=demopat\$win)

and then, to perform the analysis based on the /^-density and M-function we can run, respectively:

> Kd <- KdEnvelope(demopat2, NumberOfSimulations=39, Alpha =

0.01,

+ ReferenceType="A")

> plot(Kd)

and

> M <- MEnvelope(demopat2, NumberOfSimulations=39, Alpha =

0.01,

+ ReferenceType="A")

> plot(M) where, quite intuitively, the options NumberOfSimulations, Alpha and Refer- enceType refer, respectively, to the number of simulations for the confidence envelopes, the level of significance and typology of points for which we want to analyze the relative spatial concentration.