TWO APPROACHES TO CHOOSING NONLINEAR BOUNDARIES: DATA-GUIDED AND MULTIPLE SIMPLE UNITS
We will see two major strategies for nonlinear classification in the methods that follow.

The first is to let the training data be our guide: we choose nonlinear classification boundaries based on the examples in the training set. This allows us to obtain nontrivial nonlinear boundaries without having to estimate large numbers of parameters or commit to an obscure mathematical model. The main disadvantage is that the resulting classifiers always depend on the specific sample of data we observed for training; there is no estimate of model parameters to give us a biological intuition or a statistical model for our problem.

The second major strategy is to build the nonlinear classifier by accumulating the effects of several simple linear classifiers. This lets us control the complexity of the classifier, adding further simple linear units only when they improve classification. The major drawback of these methods is that there is typically no obvious order or starting point for this process of building up the classifier. This randomness, or contingency, in the training procedure must be dealt with, adding extra complication to the procedure.
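To make the contrast concrete, here is a minimal hand-rolled sketch, not any specific method from the chapters that follow: a one-nearest-neighbor rule stands in for the data-guided strategy (its boundary is determined entirely by the stored training sample, with no estimated parameters), and a pair of hand-chosen thresholded linear units stands in for the second strategy (their combination carves out a region that no single linear boundary can produce). The training points and unit weights are illustrative assumptions, not learned values.

```python
import math

# --- Strategy 1 (data-guided): one-nearest-neighbor ---
# The decision boundary is implicit in the stored training sample;
# no model parameters are estimated. These four points are illustrative.
train = [((0.0, 0.0), 0), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), 0)]

def nearest_neighbor(x):
    """Label x with the class of its closest training point."""
    return min(train, key=lambda pt: math.dist(x, pt[0]))[1]

# --- Strategy 2 (accumulated simple units): combining linear classifiers ---
def linear_unit(w1, w2, b):
    """A thresholded linear function: 1 if w1*x1 + w2*x2 + b > 0, else 0."""
    return lambda x: 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0

unit_a = linear_unit(1.0, 1.0, -0.5)   # fires when x1 + x2 > 0.5
unit_b = linear_unit(1.0, 1.0, -1.5)   # fires when x1 + x2 > 1.5

def combined(x):
    """1 inside the band 0.5 < x1 + x2 < 1.5, 0 outside: a nonlinear
    region obtained by combining two simple linear units."""
    return unit_a(x) - unit_b(x)

corners = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
print([nearest_neighbor(c) for c in corners])  # [0, 1, 1, 0]
print([combined(c) for c in corners])          # [0, 1, 1, 0]
```

Both toy classifiers reproduce an XOR-like labeling that no single linear boundary can achieve, but they do so in characteristically different ways: the first by memorizing the sample, the second by stacking simple parametric pieces.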