EXPERIMENTAL SETUP

Database

The database used for the experimental evaluation of the proposed method was acquired in three different sessions, spanning four months, in collaboration with colleagues at USC within the BATL (2017) project. In total, 8,214 bona fide and 3,310 PA samples, stemming from 41 different PAI species, were captured. A total of 732 different subjects participated in the data collection, from whom the ring, middle, and index fingers and the thumbs of both hands were captured. This represents the largest fingerprint dataset within the SWIR spectrum in terms of both number of samples and number of PAI species. In addition, the proportion of PA samples has increased from 1:10 to 1:2.5 with respect to the database analysed by Tolosana et al. (2019), and the number of PA samples in the test set has increased almost tenfold.

Table 5.2 presents a summary of the PAI species included in the database, classified by general type (i.e., full finger, paper print out, and overlay) and fabrication material. With respect to the database used in Tolosana et al. (2019), the new dataset particularly increases the number of samples captured from the most challenging PAI species: overlays in general; full fingers made of certain Playdoh colours, some of which previous SWIR methods had trouble detecting; and conductive materials, which pose a higher threat especially to conventional fingerprint sensors and are therefore more likely to be used by attackers. In any case, the overall selection of PAI species follows the requirements established by the IARPA Odin program.

Table 5.3 shows the partition of the database into independent training, validation, and test sets. It should be noted that some subjects participated in two different acquisition sessions; their samples (both bona fide and PA) are consequently used in only one of the three sets.

TABLE 5.2
PAI Species Included in the Database and Number of Samples Considered in Our Experimental Framework

| Type        | Material                            | Total | Train | Validation | Test |
|-------------|-------------------------------------|------:|------:|-----------:|-----:|
| Full finger | 3D print                            |    48 |    18 |         12 |   18 |
|             | 3D print + silver coating           |    24 |    12 |          8 |    4 |
|             | Ballistic gelatine                  |   144 |    26 |         10 |  108 |
|             | Dental material                     |    51 |    11 |          6 |   34 |
|             | Dragonskin                          |   426 |    88 |         63 |  275 |
|             | Dragonskin + conductive coating     |    24 |     6 |          0 |    8 |
|             | Dragonskin + nanotips white coating |    27 |     9 |          6 |   12 |
|             | Latex + gold coating                |    69 |    18 |         18 |   33 |
|             | Monster latex                       |    78 |    28 |         11 |   39 |
|             | Polydimethylsiloxane (PDMS)         |   124 |    21 |         13 |   90 |
|             | Playdoh black                       |    15 |     6 |          0 |    9 |
|             | Playdoh orange                      |    53 |    17 |          6 |   30 |
|             | Playdoh white                       |    24 |     6 |          6 |   12 |
|             | Playdoh yellow                      |    24 |     3 |          9 |   12 |
|             | Silicone                            |   147 |    47 |         38 |   62 |
|             | Silicone two part                   |    69 |    17 |          4 |   48 |
|             | Silicone + conductive coating       |    18 |     8 |          4 |    6 |
|             | Silicone + nanotips white coating   |    54 |    12 |         13 |   29 |
|             | Silicone + graphite coating         |    72 |    12 |         12 |   48 |
|             | Silly putty                         |    25 |     9 |          0 |   16 |
|             | Silly putty glow in the dark        |    15 |     6 |          3 |    6 |
|             | Silly putty metallic                |    15 |     9 |          0 |    6 |
|             | Wax                                 |    74 |    16 |         11 |   47 |
| Print outs  | Finger-vein glossy paper            |    37 |    18 |          4 |    5 |
|             | Finger-vein matte paper             |    22 |     6 |          8 |    8 |
|             | Fingerprint paper                   |    49 |    11 |          9 |   29 |
|             | Finger transparency                 |    64 |    16 |          8 |   41 |
| Overlay     | Conductive silicone                 |   260 |    20 |          8 |  232 |
|             | Dragonskin                          |   170 |    50 |         31 |   89 |
|             | Dragonskin fleshtone                |    10 |     4 |          2 |    4 |
|             | Knox gelatine                       |    21 |     5 |          4 |   12 |
|             | Monster latex                       |    34 |    15 |          8 |   11 |
|             | School glue                         |    76 |    25 |         15 |   36 |
|             | School glue white                   |    25 |     5 |          6 |   14 |
|             | Silicone                            |    24 |    12 |          9 |    3 |
|             | Silicone yellow                     |    83 |    24 |          7 |   52 |
|             | Silicone fleshtone                  |   517 |    75 |         48 |  394 |
|             | Silicone two part                   |    98 |    20 |         14 |   64 |
|             | Urethane + Ti/Au coating            |    72 |    18 |          9 |   45 |
|             | Wax                                 |    18 |     8 |          3 |    7 |
|             | Wood glue                           |    70 |    21 |         12 |   37 |

TABLE 5.3
Partition of Training, Validation and Test Datasets

|                | # Samples | # PA Samples | # BF Samples |
|----------------|----------:|-------------:|-------------:|
| Training set   |     1,538 |          769 |          769 |
| Validation set |       940 |          470 |          470 |
| Test set       |     9,046 |        2,071 |        6,975 |
| Total          |    11,524 |        3,310 |        8,214 |

In order to achieve a balanced training and no bias towards either class, the number of bona fide and PA samples should be equal in the training and validation sets. The limiting factor in the protocol design is therefore the number of available PA samples: 3,310, in comparison with 8,214 bona fide samples. At the same time, we want to maximise the number of samples in the test set to obtain more statistically significant results. Consequently, 769 samples of each class are used for training and 470 for validation, and the remaining samples (2,071 PAs and 6,975 bona fides) are used for testing. The specific number of samples of each PAI species included in each set is shown in Table 5.2.
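To make the partitioning concrete, the following is a minimal sketch of a subject-disjoint, class-balanced split in Python. The data representation, function names, and the greedy assignment strategy are illustrative assumptions; the actual protocol additionally fixes the exact per-class quotas of Table 5.3 and the per-species counts of Table 5.2.

```python
import random
from collections import defaultdict

def count(partition, label):
    """Number of samples with the given label ('pa' or 'bf') in a partition."""
    return sum(1 for _, lbl, _ in partition if lbl == label)

def split_subject_disjoint(samples, n_train=769, n_val=470, seed=42):
    """Greedy subject-disjoint split: all samples of a subject land in one set.

    `samples` is a list of (subject_id, label, data) tuples. Train and
    validation are filled until both classes reach their quota; everything
    else goes to test. A real protocol would then subsample to exactly
    769/470 samples per class, as in Table 5.3.
    """
    by_subject = defaultdict(list)
    for sample in samples:
        by_subject[sample[0]].append(sample)

    subjects = list(by_subject)
    random.Random(seed).shuffle(subjects)

    train, val, test = [], [], []
    for subj in subjects:
        if count(train, 'pa') < n_train or count(train, 'bf') < n_train:
            train.extend(by_subject[subj])
        elif count(val, 'pa') < n_val or count(val, 'bf') < n_val:
            val.extend(by_subject[subj])
        else:
            test.extend(by_subject[subj])
    return train, val, test
```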

Evaluation Metrics

The performance of the PAD method is evaluated in compliance with ISO/IEC IS 30107-3 on Biometric PAD - Part 3: Testing and Reporting (ISO/IEC JTC1 SC37 Biometrics 2017). To that end, we report Detection Error Trade-Off (DET) curves between the APCER and the BPCER. In addition, the APCER at BPCER = 0.2% (denoted as APCER0.2) will also be reported to evaluate the systems at a high-user-convenience operating point, which is the target of the Odin program.
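As an illustration of how this operating point can be computed from raw detection scores, the sketch below assumes that higher scores indicate bona fide presentations and that the PA and bona fide scores are available as NumPy arrays; the function name and score convention are assumptions, not part of the standard.

```python
import numpy as np

def apcer_at_bpcer(pa_scores, bf_scores, bpcer_target=0.002):
    """APCER at a fixed BPCER operating point (here, BPCER = 0.2%).

    BPCER: fraction of bona fide samples rejected (score < threshold).
    APCER: fraction of PA samples accepted (score >= threshold).
    """
    bf = np.sort(np.asarray(bf_scores))
    # Largest threshold that rejects at most `bpcer_target` of the bona fides.
    k = int(np.floor(bpcer_target * len(bf)))
    threshold = bf[k]
    apcer = float(np.mean(np.asarray(pa_scores) >= threshold))
    return apcer, threshold
```

Sweeping `bpcer_target` over its whole range yields the (APCER, BPCER) pairs that are plotted in a DET curve.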

Experimental Protocol

As already mentioned in Section 5.4, two different deep learning approaches are considered:

  • Training complete CNN models from scratch.
  • Applying transfer learning techniques to CNN models pre-trained on multi-class tasks (i.e., ImageNet) or on two-class problems (i.e., VGGFace); a minimal sketch of this approach follows the list.
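The sketch below illustrates the second approach, assuming PyTorch/torchvision and an ImageNet-pretrained ResNet-18 backbone; the actual architectures, layer-freezing policy, and hyper-parameters used in the chapter may differ.

```python
import torch
import torch.nn as nn
from torchvision import models

# Transfer learning for two-class PAD (bona fide vs. PA): reuse the
# pretrained convolutional features and train only a new classifier head.
model = models.resnet18(pretrained=True)       # ImageNet weights
for param in model.parameters():
    param.requires_grad = False                # freeze pretrained layers
model.fc = nn.Linear(model.fc.in_features, 2)  # new two-class head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
```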

In both cases, three different sets of experiments are carried out:

  • Baseline handcrafted RGB conversion: first, a baseline detection performance is established using the handcrafted RGB conversion proposed by Tolosana et al. (2019). The results are benchmarked against Tolosana et al. (2019) and Gomez-Barrero and Busch (2019). This way, we can assess the quality of the images acquired with the new capture device and its impact on the proposed PAD method.
  • Input pre-processing optimisation: then, the optimal filter size P (see Section 5.4.2.1 and Figure 5.3) is determined for each model in order to obtain the best possible detection performance. This is carried out individually for each CNN model described in Section 5.4.2.2.
  • Final fused system: after determining the optimal filter size and the APCERs of each CNN model, the best fusion is carried out at score level (a sketch of score-level fusion follows this list). In addition, the results are benchmarked against the state of the art reported in Tolosana et al. (2019) and Gomez-Barrero and Busch (2019).
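The score-level fusion mentioned in the last experiment can be sketched as follows. The weighted-sum rule and min-max normalisation shown here are illustrative assumptions, since the best-performing fusion is selected empirically.

```python
import numpy as np

def fuse_scores(model_scores, weights=None):
    """Weighted-sum score-level fusion of several CNN outputs.

    `model_scores` has shape (n_models, n_samples), one row of PAD
    scores per CNN model.
    """
    scores = np.asarray(model_scores, dtype=float)
    # Min-max normalise each model's scores to [0, 1] so that no single
    # model dominates the sum because of its score range.
    mins = scores.min(axis=1, keepdims=True)
    maxs = scores.max(axis=1, keepdims=True)
    normed = (scores - mins) / (maxs - mins + 1e-12)
    if weights is None:
        weights = np.full(scores.shape[0], 1.0 / scores.shape[0])
    return weights @ normed
```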
 