A large US spatial network

As an example of a large spatial network, we examined the network defined over all counties in the Continental USA.[1] We had two motivations for considering these data: one was substantive, the other methodological. The Continental USA has 3111 counties. Pairs of counties are linked when they share a common border. This adjacency in geographical space defines an unambiguous spatial relation linking counties. The Continental USA is also divided into 48 states, each made up of counties. Each state has its 'own' history. In these histories, events and outcomes are frequently described as being unique to the 'proud' history of each state. Yet, on the ground, the boundaries between many pairs of states are evident only by the signs marking them.[2] Certainly, social processes operate across the boundaries between these large aggregates. Attempts to understand these broad social processes need to move beyond state boundaries.

There have been two broad approaches to characterizing the spatial distribution of the large social, economic, and political diversity within the USA. One attempts to map broad contiguous areas of the landscape within which greater homogeneity is thought to exist. Two examples are Garreau (1981), who defined and delineated Nine Nations covering the USA, Canada, Mexico, and the Caribbean Islands, and Woodard (2011), who argued for there being eleven such nations. Their general argument has appeal, with both authors assembling considerable qualitative evidence in support of their theses.

A second broad approach is exemplified by Chinni and Gimpel (2010) who eschewed geography during their detailed data analysis. After assembling statistical data for counties, Chinni and Gimpel clustered them using these constructed variables. They then plotted these clusters of counties in geographical space to describe a 'patchwork' nation with very different patches distributed across the nation and within states.

It seemed reasonable to seek a middle ground between focusing solely on large contiguous regions and focusing solely on the attributes of the units (counties) located in geographical space. The general problem is one of clustering units based on measured variables while being attentive to relations among the units. Although it was not proposed initially for dealing with spatially distributed data, one method for doing this - clustering with relational constraints - was proposed by Ferligoj and Batagelj (1982, 1983). It clusters units based on a set of measured variables, consistent with the approach of Chinni and Gimpel (2010), while constraining cluster memberships according to a relation linking the units being clustered. The obvious relation in the US context is the spatial adjacency of counties. However, the method, as initially formulated, is impractical for any large network, especially for one as large as this spatial network. The technical concern motivating our analysis was establishing a practical computational method for networks of this size while remaining faithful to the core conception of clustering with relational constraints. The newly developed algorithms and the results of applying them are described in Chapter 9.
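
The core idea of clustering under a relational constraint can be illustrated with a small sketch. The Python code below is not the method of Ferligoj and Batagelj, nor the algorithms of Chapter 9. It is a naive agglomerative procedure, written under our own assumptions, in which two clusters may be merged only when at least one pair of their units is adjacent. The function name `constrained_clustering`, the use of squared distance between cluster means, and the six-unit toy data are all hypothetical choices made for illustration.

```python
def constrained_clustering(values, edges, k):
    """Naive agglomerative clustering that merges only adjacent clusters.

    values: one attribute vector (list of floats) per unit
    edges:  pairs (u, v) of adjacent units (the relational constraint)
    k:      desired number of clusters
    """
    # Each unit starts as its own cluster, keyed by the unit's index.
    clusters = {i: {i} for i in range(len(values))}
    # Cluster-level adjacency, initialised from the unit-level relation.
    neighbors = {i: set() for i in clusters}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    def mean(c):
        members = clusters[c]
        dim = len(values[0])
        return [sum(values[m][d] for m in members) / len(members)
                for d in range(dim)]

    def dist(a, b):
        # Squared Euclidean distance between cluster means (an assumption).
        return sum((x - y) ** 2 for x, y in zip(mean(a), mean(b)))

    while len(clusters) > k:
        # Only pairs of adjacent clusters are candidates for merging.
        candidates = [(dist(a, b), a, b)
                      for a in clusters for b in neighbors[a] if a < b]
        if not candidates:
            break  # the network is disconnected; no further merges allowed
        _, a, b = min(candidates)
        # Merge cluster b into cluster a and rewire b's neighbours to a.
        clusters[a] |= clusters.pop(b)
        for n in neighbors.pop(b):
            neighbors[n].discard(b)
            if n != a:
                neighbors[n].add(a)
                neighbors[a].add(n)
    return sorted(sorted(c) for c in clusters.values())


# Toy data: six units on a path 0-1-2-3-4-5 with one attribute each.
values = [[1.0], [1.1], [5.0], [5.2], [1.05], [0.9]]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

print(constrained_clustering(values, edges, 3))  # [[0, 1], [2, 3], [4, 5]]
```

In this toy example, units 0, 1, 4, and 5 have similar attribute values, but units 0 and 1 are never grouped directly with units 4 and 5 because they are not adjacent. Recomputing every candidate distance at each step is far too slow for a network of 3111 counties, which is the kind of practical limitation that motivates the algorithms described in Chapter 9.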

  • [1] Hawaii and Alaska were excluded, for obvious reasons.
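  • [2] Rivers are one of the exceptions when they form clear boundaries between states. Occasionally lakes do this.
  • The same argument can be made with regard to counties. However, as we claim in Chapter 9, counties represent a reasonable compromise between large heterogeneous areas like states and very small, potentially more homogeneous local areas for which systematic data do not exist.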