Propensity Score Adjustment for Observational Studies
Observational studies attempt to estimate treatment effects by comparing outcomes for subjects who are not randomly assigned to treatment and control groups. In the modern era of big data, sources of observational data are abundant, including administrative records, registries, electronic health records, claims databases, surveys, and so forth. Without a random assignment mechanism, some subjects are more likely than others to receive the treatment because of differences in individual characteristics (e.g., age, gender, disease severity). Therefore, careful statistical adjustments are needed to justify a causal interpretation of analyses based on observational data. Because the lack of randomization breaks the assignment mechanism in observational studies, it is natural to devise tools to repair this broken link. The propensity score, defined as the conditional probability of exposure to a treatment given observed covariates, is a technical tool that addresses the treatment assignment problem. It can balance the observed covariate distributions between the treatment and nontreatment groups and thus approximate a randomization-like scenario in terms of treatment assignment. Consequently, propensity score methods can reduce the covariate-induced bias in treatment effect estimation (Rosenbaum and Rubin 1983a).
Formally, denoting T as a dichotomous treatment indicator (1 for treated and 0 for untreated) and X as the vector of observed covariates, we may define the propensity score e(X) as:
e(X) = Pr(T = 1 | X).
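In practice e(X), the probability that T = 1 given X, is unknown in an observational study and has to be estimated from the data. The following is a minimal sketch of one common choice, a logistic regression of T on X, written with the Python standard library only. The simulated data, the single covariate, the coefficient values, and the function names (sigmoid, fit_propensity, propensity) are assumptions made for illustration and are not taken from the text.

```python
import math
import random


def sigmoid(z):
    """Logistic function, mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity(x, t, iterations=25):
    """Fit logit e(x) = b0 + b1 * x by Newton-Raphson and return (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for xi, ti in zip(x, t):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += ti - p
            g1 += (ti - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def propensity(xi, b0, b1):
    """Estimated e(x) = Pr(T = 1 | X = x) under the fitted model."""
    return sigmoid(b0 + b1 * xi)


# Simulated data (assumption): one covariate, true model logit e(x) = -0.5 + x.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
t = [1 if random.random() < sigmoid(-0.5 + xi) else 0 for xi in x]

b0, b1 = fit_propensity(x, t)
scores = [propensity(xi, b0, b1) for xi in x]
print(f"estimated coefficients: b0={b0:.3f}, b1={b1:.3f}")
print(f"score range: {min(scores):.3f} to {max(scores):.3f}")
```

With several covariates the same idea applies with a vector of coefficients, and an established statistical package would normally replace the hand-written fitting loop shown here.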
Propensity scores are typically used nonparametrically to remove observed confounding bias: for example, to construct matched pairs, to serve as selection weights, or to create homogeneous strata. If the structural relationship between the outcome and some covariates is known, propensity scores can be combined with regression adjustment to improve estimation efficiency. The rest of this subsection provides more details about the different ways of using propensity scores analytically.
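As a rough illustration of two of these uses, the sketch below compares an unadjusted difference in means with a weighted estimate and a stratified estimate on simulated data. Everything in it is an assumption made for illustration: the data-generating model (one confounder, a true treatment effect of 2.0), the use of the true propensity score in place of an estimated one, the choice of normalized inverse probability weights, five equal-size strata, and all function names.

```python
import math
import random


def simulate(n, seed):
    """Simulated observational data (assumption): true treatment effect is 2.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                  # observed confounder
        e = 1.0 / (1.0 + math.exp(-x))           # true propensity score e(x)
        t = 1 if rng.random() < e else 0         # non-random assignment
        y = 2.0 * t + x + rng.gauss(0.0, 1.0)    # outcome depends on t and x
        rows.append((t, y, e))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_difference(rows):
    """Unadjusted difference in mean outcomes, treated minus untreated."""
    treated = [y for t, y, _ in rows if t == 1]
    control = [y for t, y, _ in rows if t == 0]
    return mean(treated) - mean(control)


def ipw_estimate(rows):
    """Inverse probability weighting with normalized weights."""
    w1 = sum(1.0 / e for t, _, e in rows if t == 1)
    w0 = sum(1.0 / (1.0 - e) for t, _, e in rows if t == 0)
    y1 = sum(y / e for t, y, e in rows if t == 1) / w1
    y0 = sum(y / (1.0 - e) for t, y, e in rows if t == 0) / w0
    return y1 - y0


def stratified_estimate(rows, n_strata=5):
    """Size-weighted average of within-stratum differences over score strata."""
    ordered = sorted(rows, key=lambda r: r[2])
    size = len(ordered) // n_strata
    total = 0.0
    for k in range(n_strata):
        hi = (k + 1) * size if k < n_strata - 1 else len(ordered)
        stratum = ordered[k * size:hi]
        # assumes each stratum holds both treated and untreated subjects
        total += len(stratum) * naive_difference(stratum)
    return total / len(ordered)


rows = simulate(5000, seed=0)
print(f"naive:      {naive_difference(rows):.3f}")
print(f"weighted:   {ipw_estimate(rows):.3f}")
print(f"stratified: {stratified_estimate(rows):.3f}")
```

In this setup the confounder raises both the chance of treatment and the outcome, so the unadjusted comparison overstates the effect, while the weighted and stratified estimates should land much closer to the simulated value. Stratification on a handful of strata typically leaves some residual bias, because subjects within a stratum still differ somewhat in their propensity scores.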