Indexing on Biometric Databases
Authentication is essential to enforce access control. A natural way to recognise a person for authentication is through his physiological or behavioural characteristics. Physiological characteristics include face, iris, fingerprint, palm print, knuckle print, etc. that can be acquired from an individual’s body. However, behavioural characteristics are those that can be observed when a person accomplishes a specific task, such as talk, walk, and signs. All these characteristics are called biometric traits and are supposed to be unique to a person. Biometric traits are binding to a person and, therefore, are always available with the user and are difficult to steal. Figure 11.1 shows images of some of the popular biometric traits. To build an automated authentication system for this recognition approach, one has to choose a suitable trait having specific properties such as universality, uniqueness, permanence,
FIGURE 11.1 Samples of some popular biometric traits such as (a) face, (b) fingerprint, (c) hand geometry, (d) knuckle, (e) iris, (f) ear, (g) voice, (h) palm print, (i) signature, and (j) gait.
collectability, acceptability, and circumvention . Getting all these properties fully satisfied in a single trait is difficult because physiological characteristics depend on biological tissue that is prone to aging. Behavioural characteristics can be influenced by the environment and emotions. Decision on the choice of suitable biometric trait depends on the application for which the recognition system is to be deployed. If the authentication is to be deployed for a high-security area such as opening a currency chest, then highly accurate biometric trait, such as iris, could be deployed even when the trait has lesser user acceptance. It should be noted that providing iris sample is inconvenient to the user so its acceptance is low. When the system is to be deployed to less-security area such as entrance to a shopping mall then we can use traits that have high social acceptance such as face even when its accuracy is a bit low.
Authentication has two aspects. Either we wish to determine whether the given two samples of a biometric trait are acquired from the same person or we wish to determine the identity of an acquired sample. The former is called as verification while the latter is known as identification. Of these two, identification is more challenging and it is based on the biometric template database collected during the time of registration or enrolment. It can be observed that if the size of the gallery database is n then, to establish the identity of any acquired sample, it has to be compared with all the existing n samples. Due to this reason, identification takes more time when the size of the database increases. But, the increase in the size of database is bound to happen with time because a working system would keep registering more and more users in the database over time. Consider an example, assume a CCTV camera has recorded a face image of some suspicious person roaming around. A natural question would be to identify that person. To answer this, one would need to refer a facial database and compare the face in question with all faces present in the database. The time needed to perform this investigation would be proportional to the number of images in the database, and this time would increase if more face images are included in the database. This process of identity discovery is called as identification.
To automate the identification process, one has to determine suitable feature that represents the sample and then develop a matching strategy. However, to reduce the time taken for comparison, there is a need to develop a technique that can efficiently produce a small candidate list that would have high similarity score for the query image. This will result in filtering out highly dissimilar samples from being compared, thus speeding up the identification process and saving time. Such a technique is called as indexing. Indexing has two coupled phases. In the first phase, a suitable index structure is created, i.e., every sample is associated with a feature vector which conforms to an index of the database. The second phase, called as retrieval, produces a candidate list by fetching the samples from the database that are contained in the most similar index as that of the query sample. Both the methods are related and complement each other. Figure 11.2 depicts the process of indexing on a biometric database. The process starts with extracting features from the biometric sample and constructing a feature vector that best represents that sample. The generated feature vector is used for index generation and the index is stored in an index table. Whenever a query biometric sample is given to an identification system, its corresponding features are also extracted using the same feature extraction module and is mapped to the most similar index or bin in the index table. The templates stored in those bins are retrieved as a candidate list for comparison. The size of the candidate list is small as compared to the original database size. Developing an indexing scheme for a biometric database is challenging because of the inherent properties of the biometric traits and features.
One such challenge is due to high dimensionality of the biometric features as a result of which even multi-dimensional data structures like KD-tree and R-tree do not minimize the search space, and the efficiency is no better than exhaustive search. Another challenge is that features obtained from the two samples taken at different time instants of the same subject and trait are not guaranteed to be same. They are similar but not the same. Moreover, there could be the presence of some false
FIGURE 11.2 Typical block diagram of a biometric database indexing scheme.
features and some true features that may miss out. One more problem is related to the feature dimensions that it is not ordered. Of the available features, it is not easy to decide which one to consider first and which one has to be taken second. Understand this with the help of a point (x, y) in 2D space; note that (y, x) is an entirely different point and if you do not know which value of .v and which is у then you would not be able to locate the actual point. Another challenge with biometric sample in general is the creative user behaviour and unique interaction with the sensor. This may lead to various issues such as pose, occlusion, illumination, rotation, and shear transformations. Interoperability could also be an issue that arises when we use different kind of sensors to acquire different samples.
It should be noted that the biometric indexing is not the same as the general database indexing. The difference lies in the size of the retrieved list. If the index of a query item is obtained in the general database setting, it retrieves a unique database item having the same hash value equal to the query item. However, in biometric indexing, it would return a list of similar items from the database, none of them may have the same hash value. The purpose here is to continue with the search by further matching with the retrieved items for identification.
Evaluation of an indexing scheme is done based on some parameters such as hit rate, penetration rate, and bin miss rate. Hit rate is the percentage of genuine matches that are successful at top t matches from the total number of queries made. Penetration rate refers to the percentage of database that must be returned as the candidate list for a successful retrieval. Bin miss rate is another parameter that represents fraction of genuine biometric templates misplaced in a wrong class.
This chapter introduces biometric indexing and its challenges. It also provides details of an indexing technique for some popular biometric traits databases. Four physiological biometric traits are considered, viz., face, fingerprint, finger-knuckle print, and iris along with a behavioural trait, viz. signature. The next section describes facial image indexing scheme by learning predictable binary code. Section 11.3 explains fingerprint indexing using the special code called Coaxial Gaussian Track Code (CGTC). A method for finger-knuckle print indexing is explained in the subsequent section using boosted geometric hashing. Section 11.5 focuses on Iris database indexing by making use of local features. A technique to index behavioural biometric trait, viz. signature, is explained in Section 11.6. Conclusions are presented at the end.