Bias may be embedded within the collection of predictive factors scored by the algorithm. Crime is a complex issue. Faced with a multifaceted, real-world scenario in which crimes occur, tool developers must select from a multitude of possible predictors. However many dozens of factors the final algorithm scores, it cannot possibly present a complete picture of the circumstances in which crimes occur. The choice of predictors, known as feature selection, is therefore an inherently reductionist exercise (Favaretto, De Clercq and Elger, 2019). Simply put, every tool oversimplifies crime (Veale, 2019). The point is that developers introduce bias by choosing which factors to test in the first place, and again by narrowing to a smaller number of predictors to incorporate into their final algorithms (Favaretto et al., 2019).
Omitted variable bias occurs when an instrument excludes a variable that is correlated with both an existing predictor and the outcome (here, offending). Because tools as they exist today contain relatively few variables (most of which depend on criminal history measures), a plethora of relevant data is ignored. As an illustration, prior arrests for violence (a predictor) and future violent offending (the outcome) may both be correlated with neighbourhood of residence. Hence, if the model excludes home locale, it would exemplify omitted variable bias, thereby weakening the algorithm's predictive ability. This only increases the probability of significant numbers of false positives (wrongly predicting individuals who will offend) and/or false negatives (failing to predict who will offend).
Ideally, the sample on which an algorithm is trained is sufficiently representative of the population on which the tool will be used in a real-world setting. In criminal justice, this often means that the sample should reflect roughly equivalent percentages of the larger population on sociodemographic factors, such as race, ethnicity, gender, age and class. A failure of representativeness exemplifies sample bias. Because of the general lack of transparency in predictive policing tools, it is not known how representative the training dataset may be. On the one hand, unlike the experience with prediction tools at other criminal justice decision points, which are often based on external sample data (Hamilton, 2019a), predictive policing tools often appear to be learned on local datasets (Liberty, 2019). This suggests that the potential for bias to enter through nonrepresentative data is not as strong. On the other hand, the evidence suggests that predictive policing models have mostly been normed on individuals who were already known to police because they were represented in official criminal justice records. As individuals with previous police contacts may differ in risk-relevant ways from those not already known to police, the resulting algorithm will better predict offending in the former group than in the latter.
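The consequence of norming on police-known individuals can be sketched in a few lines (the figures are hypothetical; the key assumption, labelled in the comments, is that measured risk factors explain offending better for people already in police records than for everyone else):

```python
import random

random.seed(1)

def simulate(n, signal):
    """Generate (risk_score, offended) pairs. `signal` controls how much
    of offending the measured score explains (the rest is unmeasured)."""
    rows = []
    for _ in range(n):
        score = random.gauss(0, 1)
        latent = signal * score + (1 - signal**2) ** 0.5 * random.gauss(0, 1)
        rows.append((score, latent > 0))  # offend if latent factor positive
    return rows

# ASSUMPTION for illustration: measured factors are highly informative for
# the police-known convenience sample, weakly informative for the rest.
known   = simulate(5_000, signal=0.8)
unknown = simulate(5_000, signal=0.2)

# A simple decision rule "tuned" on the convenience sample: flag score > 0.
def accuracy(data):
    return sum((score > 0) == offended for score, offended in data) / len(data)

print(f"accuracy on police-known sample: {accuracy(known):.2f}")
print(f"accuracy on general population : {accuracy(unknown):.2f}")
```

The same rule that performs well on the sample it was normed on degrades sharply on the population it was never trained to represent, which is precisely the sample-bias concern.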
Sample bias is also likely to exist to some degree because model developers do not generally comply with best practice standards in empirical research. Typically, tool development does not use independent, random samples. Instead, test samples are often dependent samples (e.g. individuals arrested by the same police force) and convenience samples (i.e. the data were simply accessible).
Risk algorithms learn on historical data. The training data may incorporate information reflecting (and reifying) pre-existing discriminatory decisions (Joh, 2018). A significant source of bias derives from a disproportionately heavy reliance upon criminal history measures as predictors (recall such predictors as prior arrests, prior gun violence). Clearly, criminal history information in the historical data may represent discriminatory practices by victims, police and prosecutors based on sociodemographic characteristics (e.g. race/ethnic affiliation, gender, immigration status). For example, prosecutorial policies that impose a disproportionate burden on minorities may also introduce bias into the training data. Initiatives such as no drop policies to deter a particular problem (e.g. gun violence, knife crimes, street-level drug dealing) may increase conviction rates for minorities to a greater degree than non-minorities. If unchecked, the resulting algorithms would thereby learn that such sociodemographic traits are predictive of offending.
Equally important, the mere recitation by tool developers that their models do not include race or gender is rather misleading. Even where those factors are not explicitly listed, the algorithms will learn on factors that are proxies for sociodemographic characteristics. Recall that some of the predictive policing tools incorporate events relating to involvement in gun violence (as victims or perpetrators). To the extent that minorities and men are more likely to have been victims or perpetrators of gun violence, that factor (involvement in gun violence) will serve as a proxy for minority race and male gender.
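A minimal sketch shows how a proxy works in practice (the group labels and rates below are invented solely to illustrate the mechanism): a model that never sees group membership, and flags only on the proxy variable, still produces sharply different flag rates across groups.

```python
import random

random.seed(2)

# HYPOTHETICAL rates: involvement in gun violence (as victim or perpetrator)
# is assumed more common in group A than group B for structural reasons.
P_INVOLVED = {"A": 0.30, "B": 0.05}

people   = ["A" if random.random() < 0.5 else "B" for _ in range(100_000)]
involved = [random.random() < P_INVOLVED[g] for g in people]

# The "model" flags anyone with gun-violence involvement. Group membership
# is never an input, yet the outcome differs starkly by group.
def flag_rate(group):
    flags = [i for g, i in zip(people, involved) if g == group]
    return sum(flags) / len(flags)

print(f"flag rate, group A: {flag_rate('A'):.2f}")
print(f"flag rate, group B: {flag_rate('B'):.2f}")
```

Dropping the sensitive attribute from the feature list therefore does not remove its influence; the correlated proxy carries it into the predictions anyway.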
Another issue with gender should be highlighted here. Offender-based risk tools have tended to learn on convenience samples of individuals known to police as arrestees or considered to be potential future criminals. Males are simply arrested or identified as suspects far more often than females across jurisdictions and time frames. As a result, these convenience samples are composed predominantly of males. The risk factors therein are thus often more relevant to males, such that meaningful risk factors that are more culturally sensitive to female populations may be omitted. A risk assessment process that presumes risk tools are somehow universal, generic or culturally neutral will result in misestimation. Importantly, research indicates that a woman's likelihood of offending is impacted to a greater extent by such experiences as parental stress, personal relationship problems and the effects of prior trauma (Hamilton, 2019c). Women are also highly likely to be influenced to commit crimes by others, often their (male) intimate partners. Failure to include such gender-sensitive, risk-relevant attributes will mean the tool performs more weakly for females.
Biases, once embedded in an algorithm, can become further entrenched. Algorithms may suffer from a feedback loop in which biases are amplified over time. Biased predictions create additional inequalities from which the algorithm learns and then skews future predictions even more (New and Castro, 2018). As an example, where a jurisdiction uses a biased algorithm, higher risk predictions may mean that minorities are arrested more often, and thus are seen as more dangerous, thereby magnifying the likelihood of minority arrests in the future.
Bias may be exacerbated when the training data are produced by the same actors as those who will use the predictive tools (Veale, 2019). A predictable scenario is one in which police target a certain neighbourhood such that the arrest rates of area residents increase. This arrest data is then used to inform an algorithm. In turn, the algorithm predicts a higher risk of recidivism for this neighbourhood's residents, leading to a high rate of rearrests and thereby entrenching overpolicing practices. The results can thereby become circular. The algorithm's prediction of an individual's being subject to overpolicing is itself predicted by a past history of people like him being overpoliced (Hamilton, 2019b).
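The circularity described above can be reproduced in a toy simulation (a stylized sketch with invented rates): two neighbourhoods have identical true crime rates, but patrols are allocated to wherever recorded arrests are highest, and arrests can only be recorded where police are present. An arbitrary initial one-arrest gap then snowballs.

```python
import random

random.seed(3)

# Two neighbourhoods with IDENTICAL true crime rates (hypothetical value).
TRUE_RATE = 0.1
counts = [5, 4]  # neighbourhood 0 happens to start with one extra recorded arrest

for day in range(365):
    # The "prediction": patrol the neighbourhood with more recorded arrests.
    target = 0 if counts[0] >= counts[1] else 1
    # Arrests are only recorded where police are present, so only the
    # patrolled neighbourhood can ever add to its count.
    if random.random() < TRUE_RATE:
        counts[target] += 1

print(counts)  # the initial gap widens despite equal underlying crime
```

After a simulated year, neighbourhood 0 has accumulated dozens of recorded arrests while neighbourhood 1 remains frozen at its starting count, even though the underlying behaviour never differed: the data the algorithm learns from is a record of where police looked, not of where crime occurred.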
These issues of bias are particularly problematic as little is known about the extent to which they exist. Observers contend that the predictive policing scheme is too secretive, with tool developers and users failing to acknowledge potential avenues for bias or inaccuracies, or how they might (if at all) remediate them (Richardson, Schultz and Crawford, 2019). Despite these issues, there are many reasons to find promise in predictive policing tools in identifying individuals who are at high risk of committing a future crime.