Multi-spectral Short-Wave Infrared Sensors and Convolutional Neural Networks for Biometric Presentation Attack Detection


There is currently no doubt about the importance of subject authentication in a wide range of applications, or about the numerous advantages offered by biometric recognition with respect to password- or token-based systems (Jain 2007). We can safely say that biometrics overcomes problems like forgetting or losing a key, and that it provides a stronger link between the subject and the claimed identity. These and other advantages have led to an ever-growing deployment of biometric recognition systems in the market and in nation-wide identification scenarios (Government of India 2012).

However, biometric systems are still vulnerable to external attacks. Among the possible attack points described by Ratha, Connell, and Bolle (2001), which include both inner modules of the system and communication channels, the biometric capture device is probably the most exposed one. The main difference with respect to any other attack lies in the knowledge required by the individual launching the attack: the attacker does not need to know anything about the inner functioning of the system. Such attacks directed at the capture device are known in the literature as presentation attacks (PAs) and defined within the ISO/IEC 30107 standard on biometric presentation attack detection (PAD) as the “presentation to the biometric data capture subsystem with the goal of interfering with the operation of the biometric system” (ISO/IEC JTC1 SC37 Biometrics 2016). In other words, an attacker can present the capture device with a presentation attack instrument (PAI), such as a face mask, a gummy finger, or a fingerprint overlay, instead of his own bona fide biometric characteristic. His intention may be to impersonate someone else (i.e., active impostor) or to avoid being recognised due to black-listing (i.e., identity concealer).

Given the serious threat posed by PAs, PAD methods have been developed in the last decade to automatically distinguish between bona fide (i.e., real or live) presentations and access attempts carried out by means of PAIs (Marcel et al. 2019). Research in this new area has been fostered by the organisation of international competitions such as the LivDet series (Ghiani et al. 2017; Orru et al. 2019), and by several international projects, such as the European TABULA RASA (2010), BEAT (2012), and RESPECT (2019), or the US ODIN research program (ODNI and IARPA 2016). Such initiatives and funding programs have consequently led to the development of specific PAD methods for iris (Galbally and Gomez-Barrero 2017), fingerprint (Marasco and Ross 2015; Sousedik and Busch 2014), or face (Galbally, Marcel, and Fierrez 2014), among other biometric characteristics.

In general, PAD methods can be broadly divided into software- and hardware-based methods. Whereas the former, in the particular case of fingerprint, utilise the output of traditional optical and capacitive sensors, the latter introduce specific sensors to capture other properties of a bona fide fingerprint (Marasco and Ross 2015; Sousedik and Busch 2014). The LivDet competitions focus on software-based methods, since only conventional sensors are used to capture the fingerprint samples used in the benchmarks. For these datasets, very high detection rates, close to 100% accuracy, have been achieved. However, it should be noted that only a limited number of different PAI species (i.e., 11) are included in those benchmarks. In a recent study (Kanich, Drahansky, and Mezl 2018), the authors analyse the vulnerabilities of commercial off-the-shelf (COTS) fingerprint sensors to PAIs fabricated with 21 different materials. Their results highlight the vulnerability to most of the materials used, which are in many cases not included in the LivDet benchmarks (e.g., wax). There is therefore a clear need to further analyse the detection capabilities of current, and possibly new, PAD techniques on larger databases, including a higher variability in terms of PAI species.

However, before developing new PAD techniques, we should remember that fingerprint sensors are designed to capture the ridge and valley patterns on the finger in order to achieve the best possible recognition accuracy. This may not be the best approach to discriminate between bona fide and attack presentations. On the contrary, the use of other technologies can help increase the PA detection rates. In fact, it has been recently shown that images acquired within the short-wave infrared (SWIR) spectrum can yield very accurate PAD approaches both for face and fingerprint (Steiner et al. 2016; Tolosana et al. 2019). This is due to the fact that all skin types according to the Fitzpatrick scale (Fitzpatrick 1988) present very similar remission curves at these wavelengths, which are at the same time quite different from those of other materials commonly utilised for the fabrication of PAIs (e.g., silicone or paper) (Steiner et al. 2016). Therefore, the task of discriminating skin (i.e., bona fide presentations) from other non-skin materials (i.e., PAs) becomes easier in this part of the spectrum, in contrast to other wavelengths for which the skin types are very different among themselves and at the same time similar to, for instance, coloured silicone. We therefore analyse in this chapter the use of SWIR finger images in combination with the latest deep learning algorithms to detect a large number of fingerprint PAI species: up to 41, fabricated with 35 different materials.

Among the different works recently carried out on fingerprint PAD for SWIR images (Tolosana et al. 2019; Hussein et al. 2018; Gomez-Barrero, Kolberg, and Busch 2018, 2019; Gomez-Barrero and Busch 2019), Tolosana et al. (2019) carried out a thorough study on the soundness of using deep convolutional neural networks (CNNs) in combination with SWIR images. In particular, the sensor utilised in that work captures four grayscale images of the finger at different SWIR wavelengths. Given that most pre-trained CNN models expect RGB images (i.e., three channels: red, green, and blue), the authors defined a handcrafted pre-processing of the samples to convert the four grayscale images into three channels. These RGB images were used as input to three different CNN models [i.e., VGG19 (Simonyan and Zisserman 2015), MobileNet (Howard et al. 2017), and a self-designed ResNet (Szegedy, Ioffe, and Vanhoucke 2016)]. In the experimental evaluation, carried out on a large dataset including 35 PAI species, remarkably low error rates were achieved. Since this is also the work evaluated on the largest database in terms of PAI species so far, we build upon it to develop an improved PAD method.

In this chapter, we propose an automatic pre-processing of the four grayscale images via an additional convolutional layer, integrated with the CNN model and trained together (end-to-end approach). This way, the four grayscale images can be regarded as a single four-channel image, and the network can learn the most discriminant features for the subsequent layers to process, thereby enhancing the overall detection performance. In addition to the three networks analysed by Tolosana et al. (2019) (i.e., a ResNet trained from scratch and the pre-trained MobileNet and VGG19 models), we have studied (i) the newer MobileNetV2 model (Sandler et al. 2018), which includes residual connections in the form of inverted bottlenecks, and (ii) the VGGFace network (Parkhi, Vedaldi, and Zisserman 2015), pre-trained on facial images for recognition purposes. Since VGGFace has been trained on more skin data, this could be beneficial for the PAD task. Then, all PAD partial scores (i.e., one per CNN model) are combined with a weighted sum rule to achieve a more robust PAD scheme.
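The two ideas above can be illustrated with a minimal NumPy sketch. The additional convolutional layer is, per pixel, a learned linear map from the four SWIR channels to the three channels expected by a pre-trained backbone (a 1×1 convolution); the fusion step is a weighted sum of the partial scores produced by the individual CNNs. The function names, the toy weight values, and the normalisation of the fused score are illustrative assumptions, not the exact formulation used in the chapter.

```python
import numpy as np

def channel_adapter(img4, weights, bias):
    """Per-pixel linear map from 4 SWIR channels to 3 'RGB-like' channels,
    i.e., the effect of a 1x1 convolutional layer (weights would normally
    be learned end-to-end together with the CNN backbone).

    img4:    (H, W, 4) stack of the four SWIR wavelength images
    weights: (3, 4) mixing matrix, bias: (3,) offsets
    returns: (H, W, 3) image to feed into a pre-trained CNN
    """
    return np.tensordot(img4, weights, axes=([2], [1])) + bias

def fuse_scores(partial_scores, fusion_weights):
    """Weighted-sum fusion of the per-model PAD scores, normalised so
    that the fused score stays in the same range as the inputs."""
    assert len(partial_scores) == len(fusion_weights)
    num = sum(w * s for w, s in zip(fusion_weights, partial_scores))
    return num / sum(fusion_weights)

# Toy usage: a constant 4-channel "finger image" mapped to 3 channels,
# then three hypothetical per-CNN scores fused into one PAD score.
img4 = np.ones((8, 8, 4))
W = np.random.default_rng(0).normal(size=(3, 4))
rgb_like = channel_adapter(img4, W, np.zeros(3))
fused = fuse_scores([0.9, 0.7, 0.8], [0.5, 0.25, 0.25])
```

In a real implementation the adapter would be a trainable `Conv2d(4, 3, kernel_size=1)`-style layer prepended to the pre-trained network, so that backpropagation learns the channel mixing jointly with the PAD classifier, and the fusion weights would be tuned on a validation set.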

In addition to the aforementioned improvements on the software side of the PAD method, the capture device used to acquire the fingerprint data has also been improved. The main limitation of the sensor used in (Tolosana et al. 2019; Gomez-Barrero and Busch 2019) was the low resolution of the images (i.e., 64 x 64 px.), with the consequent loss of textural information. The capture device developed within the BATL project has been accordingly improved to capture 320 x 245 px. images with a better focus on the region of interest (ROI), i.e., the fingerprint. The performance of the proposed PAD approach is thus evaluated on a newly acquired database comprising 8,214 bona fide and 3,310 PA samples, stemming from 41 different PAI species. This new dataset hence includes a higher number of PA samples, stemming from more PAI species, which allows for a more realistic evaluation of the detection capabilities of the proposed method.
