Some models are dedicated to predicting binary events, such as default or non-default, or to estimating the probabilities that such events will occur. These models include the linear probability model and the more adequate Logit and Probit models. In what follows, "individual" means an individual observation, relating to any type of borrower, consumer or corporate. These models use the multivariate regression technique. The explanatory variables are the attributes X_i. The same techniques also apply to modeling a categorical dependent variable, that is, to assigning individuals to several categories based on observable attributes. In that case, they serve to define a typology of clients under a commercial view, rather than a risk view.
The "Basic" Linear Model Drawbacks
The simple linear probability model illustrates the principle. It makes the probability P of the event a linear function of several attributes X_i. The purpose is to relate the Bernoulli variable Y, which takes the value 0 for non-default and 1 for default, to the observable attributes. The essentials are easy to explain with a single attribute, X, but would be the same using several observable characteristics. Writing β_0 and β_1 for the intercept and slope coefficients and ε for the error term, the model is:

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

Y is either 0 or 1, so its expectation is the probability that Y equals 1: E(Y) = 1 × P(Y = 1) + 0 × P(Y = 0) = P(Y = 1). Taking the expectation of Y, and using a zero expectation for the error term ε of the regression:

$$E(Y) = P(Y = 1) = \beta_0 + \beta_1 X$$
The model provides the probability that Y equals 1, which is the default probability. All observed values of Y within the sample are either 0 or 1. However, the linear regression provides coefficients such that the fitted values of Y are not necessarily within the 0 to 1 range, so they cannot always be read as probabilities. Fitted values below 0 or above 1 have to be truncated to the nearest bound. The Logit model avoids this drawback.
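The out-of-range problem can be seen on a small example. The following is a minimal sketch, not a method taken from the text: the eight observations are invented for illustration, the helper name `fit_linear_probability` is hypothetical, and the fit is an ordinary least-squares regression with a single attribute computed with NumPy.

```python
import numpy as np

def fit_linear_probability(x, y):
    """Ordinary least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    # Design matrix with a column of ones for the intercept.
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1]

# Invented sample (assumption): one attribute X and a 0/1 default flag Y.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])

b0, b1 = fit_linear_probability(x, y)

# Fitted values are meant to be default probabilities P(Y = 1).
fitted = b0 + b1 * x

# Truncation forces the fitted values back into the 0 to 1 range.
truncated = np.clip(fitted, 0.0, 1.0)

print("intercept, slope:", b0, b1)
print("fitted:", fitted)
print("truncated:", truncated)
```

On this sample the fitted value at the smallest X is negative and the fitted value at the largest X exceeds 1, so both ends need truncating before they can be interpreted as probabilities.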