PROPOSED ALGORITHM

Problem Formulation and Objective Function

We propose a latent fingerprint enhancement algorithm based on a conditional GAN [19,20]. Given a latent fingerprint, the proposed algorithm generates a fingerprint image with a clear ridge structure and removes the structured and non-structured background noise present in the latent fingerprint. The motivation behind using a conditional GAN is that the generator must not only generate a "real-looking" binarised fingerprint image but also one whose ridge structure is similar to that of the input latent fingerprint image. Thus, we formulate latent fingerprint enhancement as a conditional GAN-based image-to-image translation problem [21].

The proposed model has two networks: a latent fingerprint enhancer network and an enhanced fingerprint discriminator (see Figure 3.3). For a given latent fingerprint image x, the enhancer network generates a binarised enhanced image Enh_L(x). The enhancer network learns the transformation from a latent fingerprint to a binarised enhanced image while preserving the overall ridge structure and ridge features, including minutiae, without compromising the identity information in the fingerprint. The discriminator network classifies a given enhanced image as real or fake. Figure 3.3 depicts the proposed model for latent fingerprint enhancement. The loss function optimised by the proposed model is described below:

1. Adversarial Loss:

The enhancer network is trained such that the adversarial loss is minimised. On the other hand, the discriminator network is trained to maximise the adversarial loss. A penalty is imposed on the enhancer network if the image generated by the enhancer network, Enh_L(x), is deemed fake by the


FIGURE 3.3 Proposed model for enhancement of latent fingerprints. The back propagation of losses while training enhancer network and discriminator network is shown by dotted lines.

discriminator. Due to this loss, the enhancer network learns the necessary transformation and associated features required to generate an enhanced fingerprint from a given latent fingerprint image.

The discriminator network is penalised if it misclassifies an enhanced fingerprint image generated by the enhancer network as a real fingerprint. As a result, the discriminator learns the discriminating features for differentiating the enhanced images produced by the enhancer from the ground-truth binarised images.

Note that the discriminator is conditioned on the input latent fingerprint image, so that it does not merely classify an enhanced image as real or fake but can also judge whether the enhanced fingerprint image has a ridge structure similar to that of the input latent fingerprint image.
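The adversarial objective above can be illustrated with a small sketch. This is not the chapter's implementation; it assumes a binary cross-entropy formulation of the adversarial loss (standard for conditional GANs) and uses hypothetical discriminator scores in place of real network outputs:

```python
import numpy as np

def bce(predictions, targets, eps=1e-7):
    """Binary cross-entropy averaged over all discriminator outputs."""
    p = np.clip(predictions, eps, 1 - eps)
    return float(-np.mean(targets * np.log(p) + (1 - targets) * np.log(1 - p)))

def discriminator_adversarial_loss(d_real, d_fake):
    """The discriminator is rewarded for scoring ground-truth binarised
    images as real (1) and enhancer outputs as fake (0)."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def enhancer_adversarial_loss(d_fake):
    """The enhancer is penalised when its output is deemed fake, i.e. it
    tries to push the discriminator's score on its images towards 1."""
    return bce(d_fake, np.ones_like(d_fake))

# Hypothetical discriminator scores for a batch of two images.
d_real = np.array([0.9, 0.8])   # scores on ground-truth binarised images
d_fake = np.array([0.2, 0.1])   # scores on enhancer outputs
print(discriminator_adversarial_loss(d_real, d_fake))  # small: D is doing well
print(enhancer_adversarial_loss(d_fake))               # large: E is penalised
```

Minimising the second quantity over the enhancer while maximising classification accuracy in the first is exactly the min-max game described above.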

2. Enhanced Fingerprint Reconstruction Loss:

The task of generating a binarised enhanced image corresponding to an input latent fingerprint image is ill-posed with the adversarial loss alone. We therefore include a fingerprint reconstruction loss in the objective function. This loss penalises only the enhancer network and guides it to generate an enhanced fingerprint similar to the ground-truth binarised fingerprint image. The reconstruction loss helps the enhancer network learn to preserve low-frequency details in the enhanced image. The ℓ1 norm is used in the loss function to encourage the enhancer to produce sharp images; the ℓ2 norm is not used as it tends to generate blurred images.
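The ℓ1-versus-ℓ2 argument can be made concrete with a toy example (not from the chapter): on a binarised ridge pattern, an output that hedges every pixel to grey is punished as much by ℓ1 as any other wrong answer, while ℓ2 rewards such averaging, which is why it tends towards blur:

```python
import numpy as np

def l1_reconstruction_loss(enhanced, ground_truth):
    """Mean absolute error: penalises deviations linearly, which tends
    to preserve sharp ridge edges in the output."""
    return float(np.mean(np.abs(enhanced - ground_truth)))

def l2_reconstruction_loss(enhanced, ground_truth):
    """Mean squared error, shown for contrast: squaring shrinks the
    penalty on mid-grey guesses, favouring blurry, averaged outputs."""
    return float(np.mean((enhanced - ground_truth) ** 2))

# Toy 1-D "ridge profile": ground truth alternates black (0) and white (1).
gt = np.array([0.0, 1.0, 0.0, 1.0])
sharp = np.array([0.0, 1.0, 0.0, 1.0])   # crisp reconstruction
blurry = np.full(4, 0.5)                  # everything averaged to grey

print(l1_reconstruction_loss(blurry, gt))  # 0.5
print(l2_reconstruction_loss(blurry, gt))  # 0.25: l2 halves the penalty
```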

3. Overall Loss: The final objective function is given as:

α* = arg min_α max_β  L_adv(α, β) + λ L_rec(α)

where α and β denote the parameters of the enhancer and discriminator, respectively, and λ is the weight parameter for the reconstruction loss.

The reconstruction loss helps to preserve the low-frequency details in the fingerprint image. However, fingerprints are oriented textured patterns with many high-frequency details. To ensure that the proposed model captures these high-frequency details, we use a PatchGAN-based model which classifies each 8×8 patch as real or fake. Furthermore, the reconstruction loss is a pixel-based loss which assumes that each output pixel is independent of its neighbouring pixels. PatchGAN, on the other hand, considers the joint distribution of the pixels in a patch. This introduces a texture loss which forces the enhancer network to preserve fine ridge details, including minutiae, and thus helps to preserve the identity information in the fingerprint image.
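The patch-wise view can be sketched as follows. This is an illustrative tiling of an image into the 8×8 regions a PatchGAN effectively scores, not the chapter's discriminator (which realises the patch decisions through convolutions rather than explicit tiling):

```python
import numpy as np

def split_into_patches(image, patch=8):
    """Tile a (H, W) image into non-overlapping patch x patch blocks:
    a PatchGAN discriminator emits one real/fake score per local region
    rather than a single score for the whole image."""
    h, w = image.shape
    cropped = image[: h - h % patch, : w - w % patch]
    blocks = cropped.reshape(h // patch, patch, w // patch, patch)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, patch, patch)

image = np.random.rand(64, 64)
patches = split_into_patches(image)
print(patches.shape)  # (64, 8, 8): one real/fake decision per patch
```

Because each decision covers a joint neighbourhood of pixels, texture statistics within a patch, and hence fine ridge detail, influence the loss.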

Training Data Preparation

The proposed model is a supervised generative model which is trained to output an enhanced image given an input latent fingerprint image. Being a supervised model, it requires paired training data of latent fingerprints and their corresponding enhanced binarised images. However, there are no publicly available latent fingerprint datasets which contain latent fingerprints together with their corresponding enhanced images. Additionally, the lack of a large latent fingerprint database further complicates the training of a deep neural network-based latent fingerprint enhancement model. Thus, for training the proposed enhancement model, we generate synthetic latent fingerprints with noise characteristics similar to those observed in real latent fingerprints (see Figure 3.4).

The proposed model is trained on 9,042 synthetic latent fingerprint images and 2,423 fingerprint images from National Institute of Standards and Technology Special Database 4 (NIST SD4) and their corresponding binarised fingerprints. Due to training on synthetic latent fingerprints, the training of the proposed model is not affected by the limited availability of the latent fingerprint database. We now give details on preparing the training data for the proposed model.


FIGURE 3.4 Sample images showcasing the training dataset. The 11 fingerprints (from top-left) share the same binarised ground-truth image (bottom-right image). Varying textures and backgrounds are used to simulate the acquisition conditions of latent fingerprints.

1. Datasets for Preparing the Training Data:

i. Anguli: Anguli [22] is an open-source implementation of the state-of-the-art synthetic fingerprint generator SFinGe [23], which simulates synthetic fingerprints with features similar to real live fingerprints. It can generate multiple impressions of a fingerprint with varying levels of noise.

ii. NIST SD4: NIST SD4 [24] is a publicly available fingerprint database containing 2,000 rolled fingerprints. These are inked fingerprints with a uniform distribution over the fingerprint pattern types, namely left loop, right loop, arch, tented arch, and whorl. Due to this uniform distribution, the training dataset covers a variety of ridge patterns. Furthermore, as these fingerprints are inked prints, their non-uniform ink deposition resembles the non-uniform powder content seen in many patches of latent fingerprints. We use NIST SD4 fingerprints with an NIST Finger Image Quality 2 (NFIQ2) [25] quality score greater than or equal to 70. (NFIQ2 is an open-source state-of-the-art fingerprint quality assessment algorithm which assigns each fingerprint image a quality score in the range 1-100, where 1 denotes the worst quality and 100 the best.) Although it would be helpful to include poor-quality inked prints in the training data, the ground-truth binarisation obtained through NBIS on poor-quality fingerprints is itself poor, which can adversely affect the performance of the model. So, we only use good-quality NIST SD4 fingerprints for training the model.

2. Generation of Synthetic Latent Fingerprints:

Due to their acquisition conditions, latent fingerprints are often blurred and contain structured noise such as lines, overlapping text, and sometimes overlapping fingerprints. We add the following noise to good-quality fingerprints generated by Anguli to create a representative synthetic latent fingerprint dataset for training the proposed model:

i. Line-Like Noise: It has been observed that line-like noise, due to its similarity with fingerprint ridges, often leads to the failure of standard fingerprint matching algorithms. To simulate line-like noise, we blend fingerprint images with straight lines of different orientations and widths.

ii. Blurring: Smudging of fingerprint ridges sometimes leads to missing minutiae. We observe that latent fingerprints often have non-uniform smudge patterns. To make the model invariant to different levels of smudging, we add different levels of Gaussian noise to randomly selected fingerprint patches. Patch sizes of 10 × 10 and 40 × 40 are used, with a blur radius of 2 for the Gaussian noise.

iii. Overlapping Text and Fingerprints: Latent fingerprints have complex background noise, which can include overlapping text and sometimes overlapping fingerprints. To simulate these scenarios, we blend fingerprint images with text images of varying fonts and styles. We also blend fingerprint images with partial fingerprint images to address the challenge of overlapping fingerprints.

iv. Different Surfaces: Latent fingerprints can be collected from different surfaces. Surfaces can be plane or curved, porous or non-porous, shiny, or have a uniform background. It has been reported that surfaces with high reflectance generate occluded ridge patterns [1]. Furthermore, the area and quality of the latent fingerprint left on a surface vary depending on the pressure exerted by the finger, the surface characteristics, and the adherence of the finger's natural secretions to that surface. Some surfaces have poor adherence properties, due to which the latent fingerprint deposited on them is often partial. To train the proposed model to be invariant to the intra-class variations introduced by various surfaces, we blend fingerprint images with varying textures such as wood, cardboard, plastic, and glass surfaces.
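Two of the degradations above can be sketched in a few lines. This is an illustrative toy, not the chapter's pipeline: the line noise is restricted to horizontal strips, the blending factors are made-up values, and the chapter's blended textures and text images are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_line_noise(fp, n_lines=3, width=2, strength=0.6):
    """Darken a few straight strips of varying position to mimic the
    line-like structured noise that resembles fingerprint ridges."""
    noisy = fp.copy()
    h, _ = fp.shape
    for _ in range(n_lines):
        row = rng.integers(0, h - width)
        noisy[row : row + width, :] *= (1.0 - strength)
    return noisy

def add_patch_noise(fp, patch=10, sigma=0.2, n_patches=5):
    """Drop Gaussian noise onto randomly selected patches to mimic the
    non-uniform smudging seen in latent fingerprints."""
    noisy = fp.copy()
    h, w = fp.shape
    for _ in range(n_patches):
        r = rng.integers(0, h - patch)
        c = rng.integers(0, w - patch)
        noisy[r : r + patch, c : c + patch] += rng.normal(0.0, sigma, (patch, patch))
    return np.clip(noisy, 0.0, 1.0)

fingerprint = rng.random((128, 128))        # stand-in for an Anguli print
latent = add_patch_noise(add_line_noise(fingerprint))
print(latent.shape)  # (128, 128)
```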

3. Ground-Truth Binarisation:

The ground-truth binarised image used to train the proposed model is obtained using the NIST Biometric Image Software (NBIS). NBIS binarises a fingerprint image based on the ridge flow direction. The image is divided into 7×9 grids; if a grid contains a ridge pattern, the grid is rotated so that it is parallel to the ridge flow direction. For the pixel of interest, the neighbouring grey values that also lie in the rotated grid are analysed to label the pixel as black or white.
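A much-simplified stand-in for this procedure is sketched below: each pixel is compared against the mean grey value of its local window. The orientation-aware step that NBIS performs (rotating the grid to align with the local ridge flow) is deliberately omitted, and the block size is a placeholder:

```python
import numpy as np

def binarise_local_mean(image, block=7):
    """Simplified stand-in for NBIS binarisation: label each pixel black
    (0) or white (1) by comparing it with the mean grey value of its
    local neighbourhood. NBIS additionally rotates the analysis grid to
    be parallel to the ridge flow direction, which is omitted here."""
    h, w = image.shape
    pad = block // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros_like(image, dtype=np.uint8)
    for r in range(h):
        for c in range(w):
            window = padded[r : r + block, c : c + block]
            out[r, c] = 1 if image[r, c] >= window.mean() else 0
    return out

ridges = np.tile(np.array([0.2, 0.2, 0.8, 0.8]), (8, 4))  # striped toy pattern
binary = binarise_local_mean(ridges)
print(sorted(set(binary.ravel().tolist())))  # [0, 1]
```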

Network Architecture and Training Details

1. Enhancer Network: The enhancer network has an encoder-decoder (autoencoder) architecture. Convolutional layers (Conv1, Conv2, and Conv3) in the network extract features at different scales from the input latent fingerprint image, capturing coarse- to fine-level details (see Figure 3.5). ResNet blocks help to circumvent the problem of vanishing gradients while training a deep network. Decoder layers (Deconv1, Deconv2, and Conv4) transform the features extracted from the latent fingerprint into an enhanced binarised fingerprint image.
2. Discriminator Network: The input latent fingerprint and the binarised image are concatenated along the channel dimension so that the discriminator can classify whether the binarised image corresponds to the input latent fingerprint image. The discriminator has a typical image classification architecture. Its convolutional layers (Conv5, Conv6, Conv7, Conv8, and Conv9) extract features at different scales, which helps the discriminator classify an input fingerprint image as real or fake.
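The channel-wise conditioning of the discriminator amounts to a simple concatenation, illustrated here with hypothetical single-channel images and image sizes:

```python
import numpy as np

# Hypothetical single-channel images in (channels, H, W) layout: the
# conditioning latent fingerprint and the binarised image under scrutiny
# (either ground truth or produced by the enhancer).
latent = np.random.rand(1, 256, 256)
binarised = np.random.rand(1, 256, 256)

# Concatenate along the channel dimension: the discriminator then sees
# both images jointly, so it can judge not only realism but also whether
# the binarised ridge structure matches the conditioning latent print.
disc_input = np.concatenate([latent, binarised], axis=0)
print(disc_input.shape)  # (2, 256, 256)
```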

The details of the network architecture are given in Table 3.2. The Adam optimiser is used to optimise the objective function with the following hyper-parameters: learning rate = 0.02, β1 = 0.5, β2 = 0.999, λ = 10, and batch size = 2. The model is trained on two GPUs, each with 12 GB RAM.


FIGURE 3.5 Architecture of the enhancer (Enh_L) and the discriminator (Dis).

TABLE 3.2
Architecture of Enh_L and Dis

| Block        | Layers                                                                                        | Kernels | Size | Stride | Padding |
|--------------|-----------------------------------------------------------------------------------------------|---------|------|--------|---------|
| Conv1        | Convolutional Layer + Batch Normalisation + ReLU                                              | 64      | 7    | 1      | 3       |
| Conv2        | Convolutional Layer + Batch Normalisation + ReLU + Convolutional Layer + Batch Normalisation  | 128     | 3    | 2      | 1       |
| Conv3        | Convolutional Layer + Batch Normalisation + ReLU + Convolutional Layer + Batch Normalisation  | 256     | 3    | 2      | 1       |
| ResNet Block | Convolutional Layer + Batch Normalisation + ReLU + Convolutional Layer + Batch Normalisation  | 256     | 3    | 2      | 1       |
| Deconv1      | Convolutional Layer + Batch Normalisation + ReLU + Convolutional Layer + Batch Normalisation  | 128     | 3    | 2      | 1       |
| Deconv2      | Convolutional Layer + Batch Normalisation + ReLU + Convolutional Layer + Batch Normalisation  | 64      | 3    | 2      | 1       |
| Conv4        | Convolutional Layer + Tanh                                                                    | 1       | 7    | 1      | 3       |
| Conv5        | Convolutional Layer + LeakyReLU                                                               | 64      | 4    | 2      | 1       |
| Conv6        | Convolutional Layer + Batch Normalisation + LeakyReLU                                         | 128     | 4    | 2      | 1       |
| Conv7        | Convolutional Layer + Batch Normalisation + LeakyReLU                                         | 256     | 4    | 2      | 1       |
| Conv8        | Convolutional Layer + Batch Normalisation + LeakyReLU                                         | 512     | 4    | 1      | 1       |
| Conv9        | Convolutional Layer                                                                           | 1       | 4    | 1      | 1       |
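The spatial sizes produced by the discriminator layers in Table 3.2 follow from the standard convolution arithmetic, floor((n + 2p − k)/s) + 1. The sketch below traces a hypothetical 256 × 256 input (the input size is an assumption, not stated in the table) through Conv5-Conv9:

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution: floor((n + 2p - k)/s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# Kernel size, stride, padding for the discriminator layers of Table 3.2.
layers = {
    "Conv5": (4, 2, 1), "Conv6": (4, 2, 1), "Conv7": (4, 2, 1),
    "Conv8": (4, 1, 1), "Conv9": (4, 1, 1),
}

size = 256  # assumed square input resolution
for name, (k, s, p) in layers.items():
    size = conv_out(size, k, s, p)
    print(name, size)  # halves through Conv5-Conv7, shrinks slightly after
```

The final layer thus emits a grid of scores rather than a single scalar, each score covering a local patch of the input, which is the PatchGAN behaviour described earlier.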

 