Classic exploratory tools and summary statistics for spatial point patterns
Traditional techniques for performing preliminary analysis of spatial point patterns consist essentially of graphical exploratory tools and formal tests for the hypothesis of complete spatial randomness (CSR). They can be subdivided into two general classes of methods: (i) quadrat-based methods and (ii) distance-based methods.
The quadrat-based approaches to the preliminary analysis of a spatial point pattern observed in a rectangular study region A involve partitioning A into contiguous rectangular sub-regions of equal area called “quadrats”. The frequencies, or counts, of points falling in the quadrats can be used to compute useful statistics or to perform tests. To begin with, if we suspect that the intensity of the underlying point process is not constant over A, we can obtain a simple estimate of the first-order intensity function by dividing each quadrat count by the area of the corresponding quadrat (Baddeley et al., 2015). See Figure 6.15 for an illustrative example. Let us suppose that the left panel displays a spatial point pattern of business units in a 100-kilometer-by-100-kilometer region. To estimate the varying intensity function of this pattern, the region is first partitioned into a 5-by-5 grid of quadrats with an area of 400 square kilometers each; then the average intensity, that is, the ratio between count and area, is computed for each quadrat. The central panel of Figure 6.15 displays the partition into quadrats and reports the corresponding intensity estimates. Since, for example, the upper-left quadrat hosts nine business units in an area of 400 square kilometers, its average intensity is 9 / 400 ≈ 0.022 business units per square kilometer. The pixel image of the quadrat intensities displayed in the right panel of Figure 6.15 represents an estimate of the first-order intensity function. Figure 6.15 shows that the intensity is not constant across the quadrats, as it varies from a minimum of 0.005 to a maximum of 0.038 units per square kilometer.
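The quadrat-count estimate of intensity described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `quadrat_counts` and `quadrat_intensity` are hypothetical, and the 250 points are simulated uniformly at random rather than taken from Figure 6.15.

```python
import random

def quadrat_counts(points, width, height, nx, ny):
    """Count points in each cell of an nx-by-ny grid over [0, width] x [0, height].

    Returns ny rows (row 0 is the bottom strip), each holding nx counts.
    """
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        # min(...) assigns points lying exactly on the top/right border to the last cell
        col = min(int(x / width * nx), nx - 1)
        row = min(int(y / height * ny), ny - 1)
        counts[row][col] += 1
    return counts

def quadrat_intensity(counts, width, height):
    """Divide each quadrat count by the (common) quadrat area."""
    ny, nx = len(counts), len(counts[0])
    area = (width / nx) * (height / ny)
    return [[c / area for c in row] for row in counts]

# Hypothetical data: 250 points scattered uniformly over a 100 km x 100 km region
rng = random.Random(42)
points = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(250)]

counts = quadrat_counts(points, 100, 100, 5, 5)      # 5-by-5 grid, 400 km^2 per quadrat
intensity = quadrat_intensity(counts, 100, 100)      # points per square kilometer
for row in reversed(intensity):                      # print the top row first, as on a map
    print("  ".join(f"{v:.4f}" for v in row))
```

Each printed value is a quadrat count divided by 400, so a quadrat holding nine points would show 0.0225.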
Figure 6.15 Quadrat-based estimation of intensity for a hypothetical pattern of business units: (a) point pattern; (b) quadrat average intensities (business units per square kilometer); (c) estimated intensity function.
Strong differences among quadrat intensities provide evidence against the CSR hypothesis and hence against the assumption that the data generating process is a homogeneous Poisson point process. A Pearson’s chi-squared test based on quadrat counts can be used to assess whether the observed differences in intensity are strong enough to reject the (null) CSR hypothesis. Let us consider the partition of A into m quadrats of equal area a, and let $n_1, n_2, \ldots, n_i, \ldots, n_m$ be the observed quadrat counts. If the CSR hypothesis is true, then each $n_i$ is a realization of an independent Poisson random variable with mean $\lambda a$, where $\lambda$ is the (unknown) first-order intensity, that is, the expected number of points per unit area. Considering that the natural estimate of $\lambda$ is $\hat{\lambda} = n / (ma)$, where n is the total number of points observed in A and ma gives the total area of A, the expected count for quadrat i under CSR is $e_i = \hat{\lambda} a = n / m$. Therefore, the proper chi-squared statistic is
$$X^2 = \sum_{i=1}^{m} \frac{(n_i - n/m)^2}{n/m}.$$
Provided that n / m is greater than 5, if the CSR hypothesis is true then the statistic $X^2$ follows approximately a $\chi^2_{m-1}$ distribution. Consequently, significantly large or small values of $X^2$ indicate that the observed point pattern in A tends to be, respectively, more aggregated or more regular than a CSR pattern.
For the point pattern of business units depicted in Figure 6.15, $X^2 = 23.97$ with a two-sided p-value of 0.927, which implies that the CSR hypothesis cannot be rejected, and hence that the observed differences among the quadrat average intensities are likely due to chance.
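The quadrat-count test can be sketched as follows, using only the Python standard library. The helper names (`chi2_cdf`, `two_sided_p`, `quadrat_test`) are hypothetical, the chi-squared distribution function is computed with a standard power series for the regularized incomplete gamma function, and the two-sided p-value is assumed to be twice the smaller tail probability. The 25 counts are invented for illustration and are not the data behind Figure 6.15.

```python
import math

def chi2_cdf(x, df):
    """Chi-squared distribution function via the series for the lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while term > total * 1e-15:
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def two_sided_p(stat, df):
    """Assumed convention: twice the smaller of the two tail probabilities."""
    cdf = chi2_cdf(stat, df)
    return 2.0 * min(cdf, 1.0 - cdf)

def quadrat_test(counts):
    """Pearson chi-squared test of CSR from counts in m quadrats of equal area."""
    m = len(counts)
    expected = sum(counts) / m                      # e_i = n / m under CSR
    stat = sum((c - expected) ** 2 for c in counts) / expected
    return stat, two_sided_p(stat, m - 1)

# Invented counts for a 5-by-5 grid (n = 240, so n / m = 9.6 > 5)
counts = [9, 11, 7, 12, 10, 8, 13, 10, 9, 6, 15, 10, 2,
          11, 9, 12, 8, 10, 14, 7, 10, 9, 11, 13, 4]
stat, p_value = quadrat_test(counts)
print(f"X^2 = {stat:.2f}, two-sided p-value = {p_value:.3f}")
```

Under this convention, `two_sided_p(23.97, 24)` should come out close to the 0.927 reported above for a 5-by-5 grid.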
The results of quadrat-based methods depend strongly on the size of the quadrats and hence on the partition of A. Unfortunately, the choice of the partitioning scheme is usually arbitrary, and no optimal criterion is available to guide it. In light of this, Greig-Smith (1952) proposed an approach to verifying the robustness of results with respect to the size of the quadrats. The approach is based on the use of alternative partitions of A characterized by different quadrat sizes. In particular, Greig-Smith (1952) suggests starting with a given grid of quadrats, such as a 32-by-32 one, and then obtaining a series of coarser grids by progressively aggregating adjacent quadrats into 2-by-2, 4-by-4, 8-by-8, and so on, blocks. For each grid, it is convenient to compute the “index of dispersion” of the quadrat counts, which is the ratio between the sample variance and the sample mean of the quadrat counts. If we let $\bar{n} = n / m$ denote the sample mean of the quadrat counts, the sample variance can be computed as:
$$s^2 = \frac{1}{m - 1} \sum_{i=1}^{m} (n_i - \bar{n})^2,$$
and hence the index of dispersion is given by:
$$I = \frac{s^2}{\bar{n}} = \frac{\sum_{i=1}^{m} (n_i - \bar{n})^2}{(m - 1)\,\bar{n}}.$$
As already discussed, under CSR the quadrat counts are independent realizations of the same Poisson random variable. Since the mean and variance of a Poisson random variable are the same, the index of dispersion of a CSR pattern should be approximately equal to 1. Therefore, plotting the values of the index of dispersion for the different grids against the corresponding block size allows us to assess how the results of the quadrat-based methods are affected by the spatial scale, that is, by the way the study area is partitioned. Figure 6.16 shows a plot of the index of dispersion versus block size (k-by-k) for the point pattern of business units depicted in Figure 6.15. The values of the index observed for each block size fluctuate around 1, thus providing evidence that the observed pattern is consistent with the CSR hypothesis at any spatial scale.
Figure 6.16 Behavior of the index of dispersion with respect to the block size (k-by-k) for the artificial data depicted in Figure 6.15.
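The block-aggregation procedure can be illustrated with a short Python sketch. The function names `aggregate` and `index_of_dispersion` are hypothetical, the sample variance is assumed to use the m - 1 divisor, and the data are simulated by dropping 4096 points uniformly at random into the cells of a 32-by-32 grid, which mimics a CSR pattern rather than reproducing the data of Figure 6.15.

```python
import random

def aggregate(grid, k):
    """Sum a square grid of counts over non-overlapping k-by-k blocks."""
    size = len(grid)
    return [
        [sum(grid[r + i][c + j] for i in range(k) for j in range(k))
         for c in range(0, size, k)]
        for r in range(0, size, k)
    ]

def index_of_dispersion(grid):
    """Ratio of the sample variance to the sample mean of the counts."""
    counts = [c for row in grid for c in row]
    m = len(counts)
    mean = sum(counts) / m
    variance = sum((c - mean) ** 2 for c in counts) / (m - 1)
    return variance / mean

# Simulated CSR-like counts: each point falls in a uniformly chosen cell
rng = random.Random(0)
size = 32
grid = [[0] * size for _ in range(size)]
for _ in range(4096):
    grid[rng.randrange(size)][rng.randrange(size)] += 1

# Index of dispersion for block sizes 1, 2, 4, 8 and 16 (at least 2-by-2 blocks remain)
for k in (1, 2, 4, 8, 16):
    blocks = aggregate(grid, k)
    print(f"{k:2d}-by-{k:<2d} blocks: index = {index_of_dispersion(blocks):.3f}")
```

For a pattern of this kind the printed values would be expected to scatter around 1, with more variability at the largest block sizes because few blocks remain.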