# Feature Selection

Feature selection, also referred to as subset selection, is used to select an effective, but reduced, set of syndromes for use by a diagnosis system. The main idea of methods for handling missing syndromes is to add statistical information to the incomplete data set to compensate for losses caused by missing syndromes. However, the additional information may also introduce irrelevant, redundant, or even misleading syndromes, lowering the diagnosis accuracy of the original system.

Therefore, the goal of feature selection is to identify the set of most important features from incomplete data for characterization. One of the most popular solutions to the subset-selection problem is based on the minimum-redundancy-maximum-relevance (*mRMR*) metric. Details of the *mRMR* method were described in Chap. 5. We adapt the *mRMR* subset-selection process to handle missing syndromes. Two approaches are discussed below.

## Complete-Case Analysis

The first method that we use to address missing syndromes in *mRMR* subset selection is complete-case analysis, as described in Sect. 6.2.2.1. Suppose we have a set of successfully repaired faulty boards with root-cause set $\mathbf{A} = \{A_1, A_2, \ldots, A_N\}$ and syndrome set $\mathbf{T} = \{T_1, T_2, \ldots, T_M\}$. We can compute the desired posterior occurrence probability of root cause $A_j$, $p(A_j \mid \mathbf{T})$. The feature selection approach can then calculate the relevance values, redundancy values, and *mRMR* values using this set of posterior probabilities. The final *mRMR* syndrome subset is determined by selecting the syndromes with the largest *mRMR* values.
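
The procedure above can be illustrated with a minimal sketch. It makes several assumptions that are not taken from this chapter: syndromes are discrete values, a missing syndrome is marked with `None`, and relevance and redundancy are estimated with empirical mutual information in the commonly used greedy "relevance minus mean redundancy" form, which may differ from the posterior-based definition in Chap. 5. The function names `complete_cases`, `mutual_information`, and `mrmr_select` are hypothetical.

```python
from collections import Counter
from math import log2

MISSING = None  # assumed marker for a missing syndrome value


def complete_cases(records, labels):
    """Keep only boards whose syndrome vector has no missing entry."""
    kept = [(r, a) for r, a in zip(records, labels) if MISSING not in r]
    return [r for r, _ in kept], [a for _, a in kept]


def mutual_information(xs, ys):
    """Empirical mutual information (bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr_select(records, labels, k):
    """Greedily pick k syndrome indices by relevance minus mean redundancy."""
    m = len(records[0])
    cols = [[r[i] for r in records] for i in range(m)]
    # Relevance: information each syndrome carries about the root cause.
    relevance = [mutual_information(c, labels) for c in cols]
    selected = []

    def score(i):
        if not selected:
            return relevance[i]
        # Redundancy: mean information shared with already selected syndromes.
        redundancy = sum(mutual_information(cols[i], cols[j])
                         for j in selected) / len(selected)
        return relevance[i] - redundancy

    while len(selected) < min(k, m):
        best = max((i for i in range(m) if i not in selected), key=score)
        selected.append(best)
    return selected


# Toy data: syndrome T2 duplicates T1; the last board has a missing syndrome.
boards = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1), (1, None, 0)]
causes = ["A1", "A2", "A3", "A4", "A3"]
full_boards, full_causes = complete_cases(boards, causes)
subset = mrmr_select(full_boards, full_causes, k=2)
```

In this toy data the incomplete board is discarded before any statistics are computed, and the duplicated syndrome is penalized by the redundancy term, so the selected subset contains the first and third syndromes rather than the first two. The drawback of complete-case analysis is visible even here: every board with a single missing syndrome is lost entirely.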