Social Selection, Dyadic Covariates, and Geospatial Effects
Garry Robins and Galina Daraganova
Individual, Dyadic, and Other Attributes
In this chapter, we introduce models that include effects for actor attributes and dyadic covariates. Actor attributes are individual-level measures on the nodes of the network. Social selection models examine whether attribute-related processes affect network ties (e.g., homophily processes whereby network ties tend to occur between individuals with similar actor attributes) (McPherson, Smith-Lovin, & Cook, 2001). A dyadic covariate, in contrast, is a measure on each dyad, that is, on a pair of actors, and may similarly affect the presence of a tie. For instance, in a study of a trust network within an organization, the formal organizational hierarchy might partly shape the formation of trust ties. In that case, inclusion of the hierarchy as a dyadic covariate permits inferences about whether trust ties tend to align with hierarchical relationships (e.g., Tom is the boss of Fred). A binary dyadic covariate can be used to represent whether people share the same attribute or membership - that is, work at the same place, live in the same household, or attend the same church. Continuous dyadic covariates are also possible. Although spatial embedding of networks, to an extent, can be captured by dyadic continuous covariates, geospatial effects are a distinctive feature, so we provide a separate section in this chapter.
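As a minimal sketch of how such dyadic covariates might be constructed in practice, the following Python fragment builds a binary covariate from shared membership and a continuous covariate from spatial distance. The data, the variable names (workplace, coords, same_workplace, distance), and the choice of Euclidean distance are illustrative assumptions, not taken from any study discussed in this chapter.

```python
import math

# Hypothetical data (assumed for illustration): each actor's workplace
# and (x, y) home coordinates.
workplace = ["A", "A", "B", "B"]
coords = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0), (6.0, 8.0)]
n = len(workplace)

# Binary dyadic covariate: 1 if actors i and j share a workplace, 0 otherwise.
# The diagonal is set to 0 because a dyad involves two distinct actors.
same_workplace = [
    [int(i != j and workplace[i] == workplace[j]) for j in range(n)]
    for i in range(n)
]

# Continuous dyadic covariate: Euclidean distance between the two actors.
distance = [
    [math.dist(coords[i], coords[j]) for j in range(n)]
    for i in range(n)
]
```

Both covariates are symmetric here because shared membership and distance are undirected by nature. A covariate such as "i is the boss of j" would instead give an asymmetric matrix.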
The preceding chapters outline the general ERGM methodology but concentrate exclusively on models for endogenous tie-based effects. The presence or absence of individual ties is affected by a surrounding neighborhood of other ties, with that neighborhood determined by the prevailing dependence assumption. These endogenous effects represent processes of network self-organization.
The inclusion of nodal attributes in a model can be seen as a relaxation of the assumption that endogenous effects are homogeneous across all nodes. However, we prefer to consider attribute effects as indicative of exogenous processes that operate alongside endogenous self-organizing mechanisms. An exogenous effect is assumed to be fixed and hence external to the self-organizing system. For instance, an attribute such as gender can be considered fixed for each actor in the network, although it still varies across actors. The variable gender is then exogenous in that it may affect the structure of network ties, but not vice versa (i.e., we would not expect the presence of a network tie to affect an actor’s gender). Similarly, the formal organizational hierarchy may be an exogenous dyadic covariate for a network of organizational trust. It is variable in that for each pair of actors, there may or may not be a hierarchical relationship, so that a second network could be defined with ties present when i is the boss of j. This second network of formal hierarchical ties may affect the network of trust (e.g., there may be a tendency for people to trust their bosses), but the hierarchical network "boss of" is exogenous (fixed). Whatever network structure there is to the social system of trust, it does not affect the formal organizational hierarchy.
Actor attributes can be any relevant measures on the nodes. They may include attributes that are genuinely nonchangeable (e.g., birthplace), or attributes that change so slowly or so rarely that in the context of a study they may be taken as fixed in practice (e.g., age in cross-sectional studies). In these cases, there can be little dispute that the attributes should be treated as exogenous. So, the network research question is whether these attributes affect the presence or absence of network ties. This is a question of “social selection”: actors may select one another as network partners, depending on the attributes that they have. Akin to a regression, the attributes are predictors of network ties.
However, actor attributes also include variables such as attitudes or behaviors that are liable to change. We can also treat these as exogenous in an ERGM social selection model, but in doing so we are making certain implicit assumptions, principally that the attribute is not changed by the network ties. Of course, this may not always be realistic. Along with social selection, there may also be processes of network “social influence,” whereby the presence of a tie may alter an attribute. For instance, people may be influenced by their network partners to change their opinions or behaviors. We cannot readily distinguish influence and selection effects in cross-sectional data (in our next chapter, we discuss this point further when presenting models for social influence). So, in ERGM social selection models, if the attributes are possibly changeable, we are still treating them as predictors of network ties but - again, analogous to a cross-sectional regression - we need to be careful about our inferences. If we see a significant attribute effect, we have evidence for an association between attributes and network ties, but we cannot make confident causal inferences. We do not know whether the attribute leads to the tie, or vice versa, so we cannot be sure whether the observed effect is one of selection or influence. If we want to distinguish selection from influence, we need to collect longitudinal data and use other methods (as discussed in Chapter 11).
In many circumstances, this is not overly problematic. For instance, in a hypothesis testing framework, if the theoretical hypothesis is about homophily, then the use of a social selection ERGM makes sense to examine evidence for the hypothesis. In other instances, inference about an association between ties and attributes, not the causality, may be sufficient for a given study. If the focus of a study is on network structure and not the attributes, it can make sense to treat attributes as predictors, akin to control variables that enable principled inferences about endogenous network structure. These types of decisions are familiar from standard regression analyses, where the associations between predictor and outcome variables are established, but not the direction of causality.
The issue of control variables is important. Sometimes we want to make inferences about attribute processes such as homophily. We still need to include endogenous tie-based effects in our model to cater for the dependencies within the data so that we can make sound inferences. Thus, a model with both exogenous attribute effects and endogenous tie effects can be viewed in two ways: if the focus is on the attribute selection effects, the endogenous network tie processes are controlled, and if the focus is on the network structure, the attribute variables operate as controls on selection effects.
Because social selection models investigate associations between ties and attributes, we often refer to attribute-based effects as “actor-relation effects” to emphasize this aspect of association between the two types of variable.
For notation, we continue to use Xij to denote a binary network tie variable. We denote an attribute variable on node i as Yi and a dyadic covariate between nodes i and j as Wij. In this chapter, we treat attribute variables and dyadic covariates as binary or continuous, although we make some comments on categorical attributes. The attribute and covariate effects described below can all be fitted using the PNet estimation software.
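To make the notation concrete, the sketch below represents a small undirected network X, a binary attribute Y, and a binary dyadic covariate W as plain Python lists, and counts a few statistics that combine ties with attributes or covariates. The data are invented, and the function name and the particular statistics (an attribute "activity" count, an attribute "interaction" count, and a covariate count) are assumptions chosen for illustration; they are not the PNet implementation, and the exact parameterization used in this chapter may differ.

```python
# Hypothetical undirected network on 4 actors, stored as a symmetric 0/1 matrix X.
X = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Hypothetical binary actor attribute Y_i (e.g., membership in some group).
Y = [1, 1, 0, 1]
# Hypothetical binary dyadic covariate W_ij (e.g., same workplace).
W = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]


def actor_relation_counts(X, Y, W):
    """Count illustrative statistics over unordered dyads i < j."""
    n = len(X)
    edges = activity = interaction = covariate = 0
    for i in range(n):
        for j in range(i + 1, n):
            x = X[i][j]
            edges += x                      # number of ties
            activity += x * (Y[i] + Y[j])   # tie ends held by actors with Y = 1
            interaction += x * Y[i] * Y[j]  # ties where both actors have Y = 1
            covariate += x * W[i][j]        # ties that coincide with W_ij = 1
    return {
        "edges": edges,
        "activity": activity,
        "interaction": interaction,
        "covariate": covariate,
    }


counts = actor_relation_counts(X, Y, W)
```

In this toy example, a relatively large interaction count would be consistent with homophily on Y, and a relatively large covariate count with ties aligning with W, but such counts only become interpretable once endogenous tie-based effects are included in a fitted model.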