ECG biometrics, being a pattern recognition problem, comprises two blocks: a feature extractor and a classifier. The performance of fiducial feature extraction approaches depends largely on the accuracy of the QRS detection algorithms, as these techniques require segmentation of the ECG wave and detection of its various sub-waves, i.e., the P-wave, the QRS complex, the T-wave, etc. Therefore, a non-fiducial approach, as suggested by K. N. Plataniotis, has been employed here for feature extraction. A number of classifiers have been reported in the literature. Based upon earlier studies carried out by the authors, MLP, RBF, and SVM have been used for the classification task. A brief review of the feature extraction technique and the classifiers is given in the following sub-sections.
The approach suggested by K. N. Plataniotis and his colleagues is also known as the AC/DCT method, as it exploits the ability of autocorrelation to extract the self-similarity in a given data sequence. The approach begins with pre-processing of the signal to remove noise in the ECG signal such as baseline wander, power-line interference, electrode contact noise, etc. For this work, pre-processing has been carried out using a fourth-order Butterworth band-pass filter with cutoff frequencies of 1 Hz and 40 Hz. The pre-processed ECG signal is windowed, with the only constraint that the windowed signal should contain at least two complete cardiac cycles; the number of samples is accordingly chosen based on the sampling rate. This is followed by autocorrelation of the resulting sequence and its normalisation, dividing the autocorrelation by the maximum value of the autocorrelation coefficients obtained. To carry out dimension reduction, the DCT of the normalised autocorrelation coefficients is computed, utilising the energy compaction ability of the DCT. The main steps involved in the implementation of this technique are mentioned below:
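The pre-processing stage can be sketched as follows, assuming Python with NumPy and SciPy. The function name is illustrative rather than from the original implementation, and a synthetic signal with artificial baseline drift stands in for a real ECG record:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess_ecg(x, fs, low=1.0, high=40.0, order=4):
    """Fourth-order Butterworth band-pass (1-40 Hz), applied zero-phase,
    to suppress baseline wander (below 1 Hz) and power-line / high-frequency
    noise (above 40 Hz)."""
    # Normalise cutoffs by the Nyquist frequency fs/2
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

# Synthetic example: in-band content corrupted by slow baseline drift
fs = 1000
t = np.arange(0, 10, 1 / fs)
clean = np.sin(2 * np.pi * 5 * t)          # stand-in for ECG content (5 Hz)
drift = 2.0 * np.sin(2 * np.pi * 0.2 * t)  # baseline wander (0.2 Hz)
filtered = preprocess_ecg(clean + drift, fs)
```

After filtering, the 0.2 Hz drift component is strongly attenuated while the in-band content passes largely unchanged.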
a. Pre-process the raw ECG signal to remove noise and segment it into non-overlapping windows.
b. Calculate the autocorrelation R(m) of the windowed ECG signal x(i) and obtain the normalised autocorrelation coefficients by using the following expression:

R̂(m) = ( Σ_{i=0}^{N−|m|−1} x(i) x(i+m) ) / R(0)
where m is the time lag, with values ranging from 0 to M − 1, and M is much smaller than N, the length of the windowed signal.
c. Obtain the significant coefficients by applying the DCT to the normalised autocorrelation coefficients; many of the DCT coefficients are zero or near zero.
d. The first C coefficients are retained to form the feature vector of a given subject.
Three parameter values need to be chosen: the interval of the ECG signal N, the value of M related to the lag m, and the number C of DCT coefficients to be used as the feature vector.
FIGURE 6.4 Comparison of AC/DCT features of two subjects.
Based on the results reported in the literature and further experimental evaluation, the values of these parameters were chosen as N = 10,000, corresponding to 10 seconds of signal, M = 180, and C = 13. The outputs of the various steps of the AC/DCT method listed above are shown in Figure 6.4 for ECG signals of two subjects. The difference in the feature vectors extracted from the ECG signals of the two subjects is evident from inspection of Figure 6.4g and h.
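The steps listed above can be sketched in Python with the chosen parameters M = 180 and C = 13. This is a minimal illustration, not the authors' implementation; NumPy/SciPy are assumed and a synthetic periodic signal stands in for a windowed ECG record:

```python
import numpy as np
from scipy.fft import dct

def ac_dct_features(x, M=180, C=13):
    """AC/DCT feature extraction: autocorrelation up to lag M-1,
    normalised by its zero-lag (maximum) value, followed by a DCT;
    the first C coefficients form the feature vector."""
    x = x - np.mean(x)
    # Autocorrelation R(m) for lags m = 0 .. M-1
    r = np.array([np.dot(x[: len(x) - m], x[m:]) for m in range(M)])
    r = r / r[0]                      # normalise by the maximum (lag 0)
    return dct(r, norm="ortho")[:C]   # energy compaction: keep first C

# Synthetic windowed "ECG": N = 10,000 samples (10 s at 1 kHz)
fs, N = 1000, 10_000
t = np.arange(N) / fs
x = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 8 * t)
f = ac_dct_features(x)                # 13-dimensional feature vector
```

Because the window covers several periods of the signal, the autocorrelation captures its self-similarity, and the DCT compacts that information into the first few coefficients.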
Artificial Neural Networks and Support Vector Machines have been used for the biometric recognition scheme explained in this chapter. An overview of these two classifiers is given below.
Artificial Neural Network (ANN)
Usually the requirement is that the system be able to learn from known samples of a pattern and then adapt itself to take decisions on unseen patterns, somewhat as humans do. Many techniques have been put forward to replicate this human learning capability; among these, ANNs computationally model the fundamental building block of the brain, i.e., the neuron.
An ANN is an interconnection of neurons in a layered manner, with the output of each neuron dependent on its inputs weighted by connecting weights and non-linearly mapped by a transfer function or activation function. Neuron layers between the input and output layers are known as hidden layers. The manner in which the various neurons are arranged is known as the architecture of the neural network. The architectures can be broadly classified as single-layer feedforward networks, multilayer feedforward networks, and recurrent networks. Some of the popular networks reported in the literature are MLP, RBF, learning vector quantisation (LVQ), the self-organising map (SOM), and the Hopfield neural network. The algorithm used to compute the weights and other parameters, on the other hand, is known as the learning rule, which can be supervised or unsupervised. The training of the MLP network is carried out by using the backpropagation algorithm, which is a supervised learning scheme. In this algorithm, the weights and biases of the different layers are updated by propagating, in the backward direction, the sensitivities obtained after feeding the feature vector at the input of the network. Among the various variants of backpropagation, the Levenberg-Marquardt algorithm is the most popular.
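As an illustrative sketch only, a one-hidden-layer MLP trained with backpropagation can be set up with scikit-learn. Note that scikit-learn does not provide the Levenberg-Marquardt algorithm, so a standard gradient-based solver is used here instead, and the 13-dimensional feature vectors are synthetic stand-ins for AC/DCT features of three subjects:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic "AC/DCT-like" features: three subjects, 40 windows each,
# clustered around subject-specific centres (illustration only).
rng = np.random.default_rng(0)
centres = rng.normal(size=(3, 13))
X = np.vstack([c + 0.05 * rng.normal(size=(40, 13)) for c in centres])
y = np.repeat([0, 1, 2], 40)

# One hidden layer of 20 neurons, trained by backpropagation
# (gradient-based 'adam' solver, not Levenberg-Marquardt).
mlp = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
mlp.fit(X, y)
acc = mlp.score(X, y)   # training accuracy on the synthetic subjects
```

With well-separated clusters like these, the network reaches near-perfect training accuracy; real ECG features would of course require a held-out test set.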
The development of neural networks followed a heuristic path, with the theory developing later. An approach that, by contrast, is built on a sound theoretical foundation and has become popular in the pattern recognition community is the SVM.
Support Vector Machine (SVM)
The SVM, proposed by Vladimir Naumovich Vapnik and Alexey Yakovlevich Chervonenkis, has emerged as a powerful tool for binary classification. It separates the two classes by constructing an optimal separating hyperplane that maximises the margin between them. The support vectors are the training samples lying closest to the optimal separating hyperplane, and the objective is to maximise the distance, or margin, between the support vectors of the two classes. In fields such as handwritten digit recognition, text categorisation, and information retrieval, SVMs hold records in performance benchmarks.
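A minimal sketch of margin maximisation, assuming scikit-learn and synthetic two-dimensional data (illustrative only): the fitted hyperplane's normal vector w gives the geometric margin 2/||w||, and the support vectors are the samples closest to the hyperplane:

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable classes in 2-D (synthetic, for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.4, size=(30, 2)),
               rng.normal(+2.0, 0.4, size=(30, 2))])
y = np.array([-1] * 30 + [+1] * 30)

clf = SVC(kernel="linear", C=1e3)    # large C: nearly hard-margin
clf.fit(X, y)

w = clf.coef_[0]                     # normal vector of the hyperplane
margin = 2.0 / np.linalg.norm(w)     # geometric margin d = 2 / ||w||
n_sv = len(clf.support_vectors_)     # samples closest to the hyperplane
```

Only the support vectors determine the hyperplane; the remaining samples could be removed without changing the decision boundary.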
Any V-dimensional feature vector f_i can be considered as a point in V-dimensional space belonging to a class c_i ∈ {−1, +1}. In the case of linear classification, the optimal separating hyperplane for the two classes is obtained in the following manner:

w · f_i + b = 0, with c_i (w · f_i + b) ≥ 1 for all i,

where w is the normal vector of the hyperplane and b its bias. The objective is that d = 2/||w||, the distance between the support vectors of the two classes, is maximised. For this, a Lagrange function is formulated and solved by minimising over w and b. Linear separability is, however, not always attained. Therefore, the input vector is mapped to a higher dimension using a kernel function. A wide variety of kernel functions, such as the polynomial and RBF kernels, have been proposed by researchers. In the experiments discussed in the next section, some of these kernels have been used for the classification task.
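A kernel SVM on synthetic 13-dimensional feature vectors (a sketch assuming scikit-learn, not the experimental setup of this chapter) might look like:

```python
import numpy as np
from sklearn.svm import SVC

# Two synthetic classes in a 13-dimensional feature space, stand-ins
# for AC/DCT feature vectors of two subjects.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 0.3, size=(50, 13)),
               rng.normal(+1.0, 0.3, size=(50, 13))])
y = np.array([-1] * 50 + [+1] * 50)

# The RBF kernel maps the input to a higher-dimensional space implicitly;
# kernel="poly" could be substituted to try a polynomial kernel instead.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)
acc = clf.score(X, y)
```

The kernel trick avoids computing the high-dimensional mapping explicitly: only inner products between samples, evaluated through the kernel function, are needed during training and classification.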