Autologistic Actor Attribute Models
Galina Daraganova and Garry Robins
9.1 Social Influence Models
So far, we have focused on how a particular network structure may be a product of endogenous network processes (clustering, transitivity, popularity, etc.) and exogenous nodal and dyadic factors (gender, membership, geography, etc.). This chapter presents a class of cross-sectional network models that, rather than modeling network structure, allows us to understand how individual behavior may be constrained by position in a social network and by the behavior of other actors in the network. For this purpose, we take network ties to be exogenous and model the behaviors of the actors. We use the term “behavior” to refer to whatever nodal attribute we are interested in modeling; it is understood to also cover, for example, attitudes and beliefs. The behavior is assumed to represent a state that, at least in principle, is liable to change, possibly several times. The network ties, however, are treated as fixed and are not changed by the attributes. In this chapter, we deal with binary attribute variables as measures of behavior: if the variable is 1, we say that the actor displays the behavior, or that the behavior is present for that actor.
Social networks are often important to understand because social processes - such as diffusion of information, exercise of influence, and spread of disease - may be potentiated by network ties. There are relatively few models available for assessing the nature of this association between individual outcomes and network structure. An early instance of a general network approach to model social influence processes originates in network autocorrelation models (Doreian, 1982, 1989, 1990; Doreian, Teuter, & Wang, 1984; Erbring & Young, 1979; Leenders, 2002), based on work in spatial statistics (Anselin, 1982, 1984; Cliff & Ord, 1973, 1981; Ord, 1975). In this approach, network ties are taken to reflect dependencies among individual variables. An explicitly dynamic but deterministic theory of network-mediated social influence was developed by Friedkin (1998), who termed it the structural theory of social influence. This theory has its roots in the work of social psychologists and mathematicians, including DeGroot (1974), Erbring and Young (1979), French (1956), Friedkin and Johnsen (1997), Harary (1959), and others. Friedkin described it as “a mathematical formalization of the process of interpersonal influence that occurs in groups, affects persons’ attitudes and opinions on issues, and produces interpersonal agreement, including group consensus, from an initial state of disagreement” (2003, 89). In particular, the structural theory of social influence describes a process in which a group of actors weigh and integrate the conflicting influences of significant others within the context of social structural constraints. Within this tradition, Valente (1995) explicitly modeled the diffusion of innovations across social structures. 
Modeling the status of the individual as a function of both individual and structural characteristics of the network is also an important approach to understanding the spread of infectious disease (contagion) in network epidemiology (Meyers, 2007; Sander et al., 2002).
These models were developed for social processes, such as influence, contagion, and diffusion, in which a network tie between two actors entails interdependent actor attributes. In other words, the network structure is used to help explain the distribution of attributes/behaviors. The models have enabled important empirical research on diffusion processes in social networks in a variety of social science fields, including studies by Davies and Kandel (1981) and Epstein (1983) on the peer effect on educational decisions, the study by Gould (1991) on mobilization in the Paris Commune, and the study by Burt and Doreian (1982) on sociologists’ perceptions of the significance of journals. Influence processes have also generated interest in economics, where they have gone by the general name of “peer effects” (Durlauf, 2001; Jackson, 2008; Manski, 1993). This term has been used to cover a variety of different forms of influence processes and has lately come to include network-mediated influence. Although these models have added substantially to the ability of researchers to understand the nature of the relationship between characteristics of individuals and their social networks, we propose to use the ERGM framework because it offers considerable flexibility in formulating models to examine different types of dependencies among variables.
9.2 Extending ERGMs to Distribution of Actor Attributes
Following the logic of ERGMs and the principled use of the dependence graph, Robins, Pattison, and Elliott (2001) proposed a class of autologistic actor attribute models that they called “social influence models” to distinguish them from social selection models. These models focus on modeling a distribution of attributes across a fixed network of relational ties. They do not attempt to model processes of interpersonal influence explicitly; rather, they are intended to investigate the extent to which a pattern of social relations among individuals (i.e., an individual’s structural position and social proximity to others) may be associated with shared opinions and/or similar behavior. As such, these models allow insight into the consequences of diffusion.
In this class of models, an “attribute” of interest is regarded as a dependent stochastic variable measured at the level of an individual, and a network tie-variable is regarded as an independent fixed variable measured at the level of the dyad. The starting point for model development is the idea that the attribute of one individual is potentially dependent on and may potentially influence the attributes of others (Durlauf, 2001).
Let the collection of random variables Y = [Yi] be a stochastic binary attribute vector, where i = 1,..., n. This is the dependent attribute of interest. The space of all possible attribute vectors is denoted by Y. A realization of the stochastic attribute vector Y is denoted by y = [yi], where yi = 1 if the attribute is present, and yi = 0 otherwise. As a reminder, a realization refers to an observed vector of attributes. As in previous chapters, let a collection of network tie-variables be represented by a fixed binary matrix x = [xij], where xij = 1 if a tie is present, and xij = 0 otherwise. There may also be other covariate (predictor) attributes W, with realization w = [wi], which may be either binary or continuous.
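The notation above can be made concrete with a small sketch. The four-actor network, the attribute vector, the covariate values, and the helper name `validate` are all hypothetical and chosen only for illustration; the sketch also assumes an undirected network without self-ties, which the text does not require.

```python
# Hypothetical 4-actor example; every value below is made up for illustration.

# Fixed binary tie matrix x: x[i][j] = 1 if a tie between i and j is present.
x = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

# One realization y of the binary attribute vector Y: y[i] = 1 if present.
y = [1, 1, 0, 0]

# One covariate value w[i] per actor; covariates may be binary or continuous.
w = [0.5, 1.2, -0.3, 0.0]


def validate(y, x, w):
    """Check that y, x, and w describe the same n actors; return n."""
    n = len(y)
    if len(x) != n or any(len(row) != n for row in x) or len(w) != n:
        raise ValueError("y, x, and w must all refer to the same n actors")
    if any(v not in (0, 1) for v in y):
        raise ValueError("attribute values must be 0 or 1")
    if any(v not in (0, 1) for row in x for v in row):
        raise ValueError("tie values must be 0 or 1")
    if any(x[i][i] != 0 for i in range(n)):
        raise ValueError("self-ties are not allowed in this sketch")
    # Assumption of this sketch only: the network is undirected.
    if any(x[i][j] != x[j][i] for i in range(n) for j in range(n)):
        raise ValueError("this sketch assumes a symmetric tie matrix")
    return n


n = validate(y, x, w)
# With n binary attributes, the space of possible attribute vectors
# contains 2**n elements.
print(n, 2 ** n)
```

Only y is treated as random in what follows; x and w stay fixed.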
With the network ties treated as exogenous (i.e., explanatory), network-based social influence effects may be inferred when i’s attribute is associated with the attributes of the actors who may have social relations with i through the network ties. That is, we assume that the probability of an attribute being present depends on the presence of the attributes in some local network neighborhood of the actor. It is also possible that i may adopt a behavior solely on the basis of i’s position in the network, such as greater popularity or activity, or because of other attributes of i. These possibilities need to be built into the model.
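Generalizing the ERGM approach, we can specify a probability for observing the attribute for each possible observation:
\[
\Pr(Y = y \mid X = x) = \frac{1}{\kappa} \exp\Big( \sum_{I} \theta_I \, z_I(y, x, w) \Big) \tag{9.1}
\]
where θI and zI are parameters and statistics for network-attribute configurations involving an interaction of dependent attribute (y), network (x), and covariate (w) variables, and κ is a normalizing constant, dependent on the parameters, that ensures that the probabilities sum to 1 over all possible attribute vectors. Examples of configurations are given in Tables 9.1 to 9.3 later in this chapter.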
Equation (9.1) describes a probability distribution of vectors on n nodes in a given graph x. Each possible vector is assigned a specific probability based on the relative number of various configurations present and on the parameter values. When a parameter is large and positive, vectors with many corresponding configurations are more likely to be observed; conversely, for a large negative parameter, such vectors are less probable. The proposed models predict the outcome variable Y while taking the network dependencies between observations into account in a principled way that cannot be addressed in the standard logistic regression. It is worth emphasizing that when the dependence among attributes via network ties is not assumed, the autologistic actor attribute model (ALAAM) is equivalent to standard logistic regression.
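For a very small network, the distribution described above can be computed exactly by enumerating every possible attribute vector. The following is a minimal sketch, not the estimation procedure used in practice. The four statistics (attribute density, activity, contagion, and a covariate effect) are assumptions chosen for illustration rather than the configurations of Tables 9.1 to 9.3, and the function names, the example network, and the parameter values are hypothetical.

```python
import itertools
import math


def statistics(y, x, w):
    """Illustrative configuration counts z(y, x, w) for an undirected network."""
    n = len(y)
    density = sum(y)                                    # actors with the attribute
    activity = sum(y[i] * sum(x[i]) for i in range(n))  # attribute weighted by degree
    contagion = sum(                                    # ties joining two actors
        x[i][j] * y[i] * y[j]                           # who both have the attribute
        for i in range(n)
        for j in range(i + 1, n)
    )
    covariate = sum(y[i] * w[i] for i in range(n))      # attribute weighted by w
    return [density, activity, contagion, covariate]


def joint_distribution(theta, x, w):
    """Return {y: Pr(Y = y | x)} by brute force over all 2**n vectors."""
    n = len(x)
    vectors = list(itertools.product([0, 1], repeat=n))
    weights = [
        math.exp(sum(t * z for t, z in zip(theta, statistics(y, x, w))))
        for y in vectors
    ]
    kappa = sum(weights)  # normalizing constant: sum of all unnormalized weights
    return {y: wt / kappa for y, wt in zip(vectors, weights)}


# Hypothetical fixed network and covariates.
x = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
w = [0.5, 1.2, -0.3, 0.0]

# All parameters zero: every attribute vector is equally likely.
uniform = joint_distribution([0.0, 0.0, 0.0, 0.0], x, w)

# A positive contagion parameter favors vectors with many tied pairs
# in which both actors have the attribute.
contagious = joint_distribution([0.0, 0.0, 1.0, 0.0], x, w)
most_likely = max(contagious, key=contagious.get)
print(most_likely, round(contagious[most_likely], 3))
```

Enumeration is feasible only for a handful of actors because the number of vectors doubles with each added node; the sketch is meant to show how parameter values shift probability toward vectors rich in the corresponding configuration.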
The comparison with logistic regression becomes more apparent if we consider the conditional form of the model. We noted in Chapter 6 that ERGMs can be expressed either in a joint form, as in Equation (9.1), or in conditional form as a conditional log-odds. If we take the conditional form of Equation (9.1) and separate out the configurations by type as explained here, we have
\[
\log \frac{\Pr(Y_i = 1 \mid y_{-i}, x, w)}{\Pr(Y_i = 0 \mid y_{-i}, x, w)} = \theta_1 + \sum_{P} \theta_P \, z_P(x) + \sum_{I} \theta_I \, z_I(y, x) + \sum_{C} \theta_C \, z_C(w) + \sum_{IC} \theta_{IC} \, z_{IC}(x, w)
\]
where y−i denotes the attributes of all actors other than i.
Here, θ1 is an intercept term and is analogous to the edge parameter in a standard ERGM. The θP parameters (P for “position”) predict Yi from the network position of i, and the statistics zP relate to network configurations that involve node i. We term these “network position” effects, and examples are given in Table 9.1. The θI parameters (I for “influence”) predict Yi from the Y attribute of other actors j who are connected to i in some way. The statistics zI relate to network configurations that involve nodes i and j, and their network connections. We term these “network attribute” effects, and examples are shown in Table 9.2. The θC parameters (C for “covariate”) predict Yi from covariate attributes of i in some way (top line of Table 9.3). The θIC parameters predict Yi from covariates of other actors j who are connected to i in some way. The statistics zIC relate to network configurations that involve nodes i and j, and their network connections. We term these “covariate” effects, and examples are provided in Table 9.3.
If there is no network diffusion in the system (i.e., if the network is irrelevant), then θP = θI = θIC = 0, and the model reverts to a standard logistic regression, with the parameters θ1 and θC as the standard logistic regression coefficients. However, when some form of network effect is present, the model is quite different from a logistic regression. In that case, when predicting i’s attribute status, the attribute variable Yi is not only a “response” variable; it is also a predictor variable when predicting j’s attribute status. This directly implies the autologistic nature of this class of models. Using a standard logistic regression in this situation will lead to biased estimates and a risk of improper statistical inference.
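The reduction to logistic regression can be seen in a short sketch of the conditional log-odds for a single actor. It assumes one hypothetical effect of each type: degree as the position effect, the number of network partners with the attribute as the influence effect, the actor's own covariate, and the sum of the partners' covariates. The dictionary keys, function names, and parameter values are illustrative assumptions, not the chapter's specification.

```python
import math


def conditional_log_odds(i, y, x, w, theta):
    """Log-odds that Y_i = 1 given all other attributes, the ties, and w."""
    n = len(y)
    degree = sum(x[i])                                        # position effect
    partners_with_attribute = sum(x[i][j] * y[j] for j in range(n))  # influence
    partner_covariates = sum(x[i][j] * w[j] for j in range(n))
    return (
        theta["intercept"]
        + theta["position"] * degree
        + theta["influence"] * partners_with_attribute
        + theta["covariate"] * w[i]
        + theta["partner_covariate"] * partner_covariates
    )


def conditional_probability(i, y, x, w, theta):
    """Convert the conditional log-odds into Pr(Y_i = 1 | rest)."""
    return 1.0 / (1.0 + math.exp(-conditional_log_odds(i, y, x, w, theta)))


# Hypothetical data: a fixed undirected network, attributes, and covariates.
x = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
y = [1, 1, 0, 0]
w = [0.5, 1.2, -0.3, 0.0]

# All network-related parameters are zero: the log-odds depend only on
# the intercept and the actor's own covariate, as in logistic regression.
no_network = {"intercept": -1.0, "position": 0.0, "influence": 0.0,
              "covariate": 0.5, "partner_covariate": 0.0}

# Nonzero position and influence parameters: the log-odds for actor i now
# also depend on i's degree and on the attributes of i's partners.
with_network = {"intercept": -1.0, "position": 0.2, "influence": 0.8,
                "covariate": 0.5, "partner_covariate": 0.0}

print(conditional_log_odds(2, y, x, w, no_network))
print(conditional_log_odds(2, y, x, w, with_network))
```

Under `no_network`, changing another actor's attribute leaves the log-odds for actor i unchanged; under `with_network`, it does not, which is the dependence among observations that a standard logistic regression ignores.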
Autologistic actor attribute models (ALAAMs) differ from exponential random graph models for social networks in the following way. ERGMs express interdependent tie-variables Xij as a function of endogenous tie-variables and exogenous variables (e.g., attribute variables Yi or spatial (dyadic) variables), whereas ALAAMs express interdependent actor attributes Yi as a function of exogenous tie-variables Xij (as well as, in principle, other exogenous attribute variables and possibly spatial covariates). In other words, ERGMs model ties, given the attributes, and ALAAMs model an attribute, given the ties (and other attributes). Although the two classes differ in which variables are explanatory and which are the response, both are models for a class of mutually interdependent variables that may also depend on another class of exogenous variables. An important step for both models is to specify the dependence assumptions appropriately because the proposed dependencies determine the form of the configurations parameterized in the model.