On Training Generative Adversarial Network for Enhancement of Latent Fingerprints

INTRODUCTION

Latent fingerprints are impressions of the ridges on the fingertips that are unintentionally deposited on the surface of an object when the subject touches it. These fingerprints are lifted by forensic experts using specialised techniques such as dusting or chemical processing. Latent fingerprints have unclear ridge structure, partial ridge information, and uneven contrast between ridges and valleys. They also contain structured noise due to overlapping text, lines, stains, and sometimes overlapping fingerprints in the background. Figure 3.1a shows sample latent fingerprint images from the IIITD-MSLF database [1].

Latent fingerprints lifted from a crime scene are matched against the fingerprints in a law enforcement agency's database to identify suspects. Standard fingerprint matching systems are designed for good quality fingerprints. However, due to the poor quality of latents, standard fingerprint feature (minutiae) extractors, which perform well on plain and rolled fingerprints, often fail on latent fingerprints [3]. Figure 3.1b shows that true minutiae are often missed due to smudged and blurred ridges, while many spurious minutiae are extracted due to background noise. As a result, the matching accuracy achieved by standard fingerprint matchers on latent fingerprints is far from satisfactory.

FIGURE 3.1 (a) Sample latent fingerprints from IIITD-MSLF database depicting background noise, degraded fingerprint ridges, background with textures and multiple fingerprints overlapping with each other, (b) Fingerprints exhibiting the improvement of minutiae detection on enhanced images generated by proposed algorithm. Left column exhibits the original fingerprints, middle column showcases the minutiae detected (shown by blue dots) on original fingerprints using the NBIS tool [2]. Right column shows improved minutiae detection post enhancement.

Due to this, latent fingerprints are matched manually by latent fingerprint examiners, which places a huge burden on them. Furthermore, studies have reported inconsistency across the evaluations of latent fingerprint examiners [4,5]. There is therefore a serious need to automate latent fingerprint matching, so that fast and accurate matching can be performed over the whole fingerprint database and not just a small subset of suspects. One of the key techniques to improve latent fingerprint matching performance is an enhancement module. An enhancement algorithm improves the contrast between ridges and valleys, removes background noise, and predicts the missing ridge information, thus facilitating correct minutiae extraction and, in turn, improving the matching performance. Figure 3.2 depicts the overall framework of latent fingerprint matching.

FIGURE 3.2 Latent fingerprint matching framework.

RELATED WORK

The early literature on latent fingerprint enhancement focuses on accurate estimation of orientation field of ridges in latent fingerprints. The estimated orientations are then fed to the Gabor filter to enhance latent fingerprints. Given below are the approaches of latent fingerprint enhancement which approximate the orientation field and utilise it to enhance latent fingerprints:

Yoon et al. [6] propose an orientation estimation algorithm that requires manually marked ROI (Region of Interest) and singular points. At first, the orientation skeleton image is derived from Verifinger [7] (the state-of-the-art commercial fingerprint matching tool). From these orientations, reliable and unreliable blocks are identified. Reliable blocks have orientations coherent with the neighbouring blocks. For the unreliable blocks, the orientations are re-estimated by interpolating the orientations of the reliable blocks. Using the interpolated orientations, the fingerprint rotation and skin distortion model are estimated. Furthermore, orientations are computed from the singular points using the zero-pole technique. Finally, the orientation field is estimated from the orientation obtained through the zero-pole method and the estimated distortion model. Gabor filtering guided by the estimated orientation is applied to obtain the enhanced image.
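The split into reliable and unreliable blocks, followed by re-estimation from reliable neighbours, can be illustrated with a minimal sketch. It is not the implementation of [6]: the agreement measure, the 8-connected neighbourhood, the threshold of 0.5 and all function names are assumptions made for illustration. Ridge orientations are defined modulo pi, so the sketch averages them in the doubled-angle domain.

```python
import math

def mean_orientation(angles):
    """Average orientation in [0, pi), computed on doubled angles."""
    c = sum(math.cos(2 * a) for a in angles)
    s = sum(math.sin(2 * a) for a in angles)
    return (0.5 * math.atan2(s, c)) % math.pi

def agreement(theta, others):
    """1.0 if theta equals the mean of `others`, -1.0 if perpendicular to it."""
    return math.cos(2 * (theta - mean_orientation(others)))

def neighbours(field, r, c):
    """Indices of the 8-connected neighbours of block (r, c) inside the grid."""
    rows, cols = len(field), len(field[0])
    return [(i, j)
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))
            if (i, j) != (r, c)]

def reestimate(field, threshold=0.5):
    """Mark blocks that disagree with their neighbours as unreliable and
    replace them by the mean orientation of their reliable neighbours.
    The threshold is an assumed value, not one taken from the literature."""
    rows, cols = len(field), len(field[0])
    reliable = [[agreement(field[r][c],
                           [field[i][j] for i, j in neighbours(field, r, c)]) >= threshold
                 for c in range(cols)] for r in range(rows)]
    out = [row[:] for row in field]
    for r in range(rows):
        for c in range(cols):
            if not reliable[r][c]:
                good = [field[i][j] for i, j in neighbours(field, r, c) if reliable[i][j]]
                if good:
                    out[r][c] = mean_orientation(good)
    return out, reliable

# Toy field: uniform orientation with one corrupted block in the centre.
field = [[0.3] * 5 for _ in range(5)]
field[2][2] = 0.3 + math.pi / 2
smoothed, reliable = reestimate(field)
```

In this toy field only the centre block is flagged as unreliable, and it is replaced by the orientation of its neighbours.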

Yoon et al. [8] perform orientation field estimation assuming that the manually marked ROI and singular points are available for the input latent fingerprint image. The initial orientation field is computed by the Short Time Fourier Transform (STFT) enhancement algorithm. However, the performance of STFT can be easily affected by unstructured background noise. They employ a two-level approach in which they first merge compatible orientation elements in a neighbourhood into an orientation group. Next, they generate the ten best global orientation fields using randomised random sample consensus (R-RANSAC). Gabor filters with each of the ten orientation fields are employed to obtain ten enhanced latent fingerprint images. Matching is carried out with all ten images, and the maximum match score serves as the final match score of the latent.
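The Gabor filtering step that these orientation-based methods rely on can be sketched as follows. This is a generic even-symmetric Gabor kernel tuned to a local ridge direction, not the specific filter of [6] or [8]; the ridge frequency, the isotropic Gaussian width, the kernel size and the function names are assumed values chosen for illustration.

```python
import math

def gabor_kernel(theta, frequency=0.1, sigma=4.0, size=11):
    """Return a size x size even-symmetric Gabor kernel as a list of rows.

    theta is the local ridge direction in radians; the cosine wave
    oscillates across the ridges, i.e. perpendicular to theta.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            along = x * math.cos(theta) + y * math.sin(theta)
            across = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(along ** 2 + across ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * frequency * across))
        kernel.append(row)
    return kernel

def filter_response(patch, kernel):
    """Sum of element-wise products of a patch and a kernel of the same size."""
    return sum(p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow))

# Synthetic patch with horizontal ridges (intensity varies only with y).
size, frequency = 11, 0.1
half = size // 2
patch = [[math.cos(2 * math.pi * frequency * y) for _ in range(size)]
         for y in range(-half, half + 1)]

matched = filter_response(patch, gabor_kernel(0.0, frequency))
mismatched = filter_response(patch, gabor_kernel(math.pi / 2, frequency))
```

The response is much stronger when the kernel orientation matches the ridge direction, which is why an accurate orientation field matters for Gabor-based enhancement.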

Feng et al. [9] argue that orientation estimation is analogous to spelling correction in a sentence. They propose to create a dictionary of orientation patches estimated from good quality fingerprint patches. Creating a dictionary helps to eliminate non-word errors, i.e., predictions of orientations that cannot occur in real fingerprints. They further argue that, just as contextual information helps in spelling correction, the orientations of neighbouring patches should be utilised when estimating the orientation of a given patch. To begin with, they compute an initial estimate of the orientation field using STFT. They then compare the initial estimate with each dictionary element and identify potential candidates. They use compatibility between neighbouring patches to find the optimal candidate. The orientation information of all orientation patches is then summarised to obtain the final orientation field.
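The candidate look-up step of such a dictionary-based approach can be sketched as below. The similarity measure (mean cosine of the doubled angular difference), the value of k and the function names are assumptions for illustration and are not taken from [9]; the subsequent selection of the optimal candidate using neighbour compatibility is omitted.

```python
import math

def patch_similarity(p, q):
    """Mean of cos(2 * (a - b)) over corresponding blocks of two patches.

    Patches are flat lists of orientations in radians; 1.0 means identical.
    """
    return sum(math.cos(2 * (a - b)) for a, b in zip(p, q)) / len(p)

def top_candidates(initial_patch, dictionary, k=3):
    """Indices of the k dictionary patches most similar to the initial estimate."""
    ranked = sorted(range(len(dictionary)),
                    key=lambda i: patch_similarity(initial_patch, dictionary[i]),
                    reverse=True)
    return ranked[:k]

# Toy dictionary of three 2x2 orientation patches (flattened).
dictionary = [[0.0] * 4, [math.pi / 4] * 4, [math.pi / 2] * 4]
# Noisy initial estimate: mostly horizontal, one block corrupted by noise.
noisy = [0.1, -0.05, 0.0, 0.9]
candidates = top_candidates(noisy, dictionary, k=2)
```

Because every candidate comes from the dictionary, the corrupted block cannot drag the result towards an orientation pattern that never occurs in good quality fingerprints.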

Yang et al. [10] utilise the spatial locality information present in fingerprints to improve the quality of the estimate. The authors claim that only specific orientations occur at a given location, e.g., the orientations at the middle of a fingerprint differ from the orientations at the top. In order to exploit this information, they introduce localised dictionaries, i.e., they create a dictionary for every location in a fingerprint. As a result, each dictionary contains only a limited number of orientations, leading to faster dictionary look-ups. Moreover, this technique leads to even fewer non-word errors.

Chen et al. [11] observe that the average size of noise is not the same in all latent fingerprints. Rather, it varies across different qualities of latent fingerprints. For a poor quality image, one can obtain better results by using a dictionary with bigger patch size and vice versa. So, a dictionary created for only a particular size of orientation patches will not work for all latent fingerprints. The authors solve this problem by creating multi-scale dictionaries, i.e., dictionaries of different patch sizes. They use compatibility between neighbours across different scales to find the optimal orientation patch for a given estimate.

Cao and Jain [12] discuss the limitations of dictionary-based methods. They further argue that there is a need for methods which can learn the orientation field from poor quality latent fingerprints. They formulate the estimation of the orientation field from a fingerprint image as a classification problem and address it using a convolutional neural network (CNN)-based classification model. The real challenge in using a deep architecture is obtaining a large number of latent fingerprints for training the network. For this purpose, they propose a model to simulate the texture noise present in latent fingerprints. Several structured and unstructured noise patterns are injected into good quality fingerprints to synthesise latent fingerprints. K-means clustering is performed on orientation patches of good quality images to select 128 representative orientation patch classes. They extract 1,000 orientation patches for each orientation class and train the network with the corresponding simulated latents. After training, the model predicts an orientation class for each patch of the input latent fingerprint.
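The clustering step that defines the orientation classes can be sketched with a plain k-means over orientation patches. This is a toy illustration, not the pipeline of [12]: the doubled-angle embedding, the initialisation from the first k patches and the function names are assumptions, and the example uses two classes rather than 128.

```python
import math

def embed(patch):
    """Map a patch of orientations (defined modulo pi) to a flat vector of
    cos/sin of the doubled angles, so that 0 and pi coincide."""
    vec = []
    for a in patch:
        vec.extend((math.cos(2 * a), math.sin(2 * a)))
    return vec

def sq_dist(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))

def kmeans(vectors, k, iterations=20):
    """Lloyd's k-means; centres are initialised from the first k vectors."""
    centres = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: sq_dist(v, centres[j])) for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return centres, labels

def nearest_class(patch, centres):
    """Index of the orientation class whose centre is closest to the patch."""
    v = embed(patch)
    return min(range(len(centres)), key=lambda j: sq_dist(v, centres[j]))

# Toy data: two near-horizontal and two near-vertical orientation patches.
patches = [[0.0, 0.05], [1.5, 1.55], [0.02, 0.0], [1.52, 1.5]]
centres, labels = kmeans([embed(p) for p in patches], k=2)
```

Each cluster centre then acts as one orientation class; in [12] the class for a noisy patch is predicted by the trained CNN rather than by the nearest-centre rule used in this sketch.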

Liu et al. [13] pose the estimation of orientations as a denoising problem and propose sparse coding for denoising orientation patches. The authors create multi-scale dictionaries from good quality fingerprints. After computing the initial estimate, they reconstruct the orientation using the dictionary of the smallest patch size with sparse coding. The quality of an orientation patch is then estimated based on its compatibility with neighbours. If the quality is below a certain threshold, the orientation patch is reconstructed using a dictionary of bigger patches. This process continues until the quality of the reconstructed orientation patch is satisfactory.
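The coarse-to-fine control flow described above can be sketched as a simple loop. Only the escalation logic is shown: `reconstruct` and `quality` are hypothetical callables standing in for sparse-coding reconstruction and neighbour-compatibility scoring, and the choice to reconstruct from the initial estimate at every scale is an assumption rather than a detail confirmed by [13].

```python
def reconstruct_multiscale(initial, dictionaries, reconstruct, quality, threshold):
    """Try dictionaries in order of increasing patch size until the
    reconstructed estimate is good enough.

    dictionaries -- list ordered from the smallest to the largest patch size
    reconstruct  -- hypothetical callable (estimate, dictionary) -> estimate
    quality      -- hypothetical callable estimate -> float (higher is better)
    Returns the final estimate and the index of the dictionary that produced it.
    """
    estimate, used = initial, None
    for level, dictionary in enumerate(dictionaries):
        estimate = reconstruct(initial, dictionary)
        used = level
        if quality(estimate) >= threshold:
            break
    return estimate, used

# Stand-ins for illustration only: a "dictionary" is just its patch size,
# and quality grows with the patch size used for the reconstruction.
sizes = [8, 16, 32]
estimate, used = reconstruct_multiscale(
    initial="noisy-field",
    dictionaries=sizes,
    reconstruct=lambda est, size: (est, size),
    quality=lambda est: est[1] / 32,
    threshold=0.5,
)
```

With these stand-ins the loop stops at the second dictionary, the first one whose reconstruction reaches the quality threshold.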

Chaidee et al. [14] propose sparse coded dictionary learning in the frequency domain which fuses responses from Gabor and curved filters. In the offline stage, a dictionary is constructed from the frequency response. In the online stage, the spectral response is computed and then encoded by the spectral encoder. The sparse representation of the spectral code is computed and then decoded by the spectral decoder to reconstruct the Fourier spectrum. A weighted sum of the reconstructed images obtained from the two filters is computed to obtain the final enhanced image.

Recently, attention has shifted towards directly generating the enhanced fingerprint without explicitly approximating the orientation field. We now describe such latent fingerprint enhancement algorithms:

Qu et al. [15] propose a deep regression neural network which outputs orientation angle values. The input latent fingerprint image is first pre-processed using total variation decomposition and Log-Gabor filtering. The pre-processed latent is then given as an input to the network, and orientation is estimated. Boosting is performed to further improve the prediction accuracy.

Li et al. [16] propose a multi-task learning-based enhancement algorithm which works at the patch level. An input latent fingerprint image is pre-processed using total variation decomposition, and the texture component is used as the input to the proposed model. The proposed solution is based on an encoder-decoder architecture trained with a multi-task learning loss. One branch enhances the latent fingerprint and the other branch predicts the orientation for the input image. This algorithm requires orientation field information as part of the training data to train the network to generate the enhanced fingerprint image. Thus, this algorithm is beyond the scope of this chapter.

Svoboda et al. [17] suggest an end-to-end convolutional autoencoder architecture which implicitly minimises the orientation and gradient loss between the target enhanced fingerprint and the fingerprint produced by their model. The objective function is designed such that it only minimises an ℓ2 loss, and it cannot address perceptual information. A brief summary of the limitations of the state-of-the-art methods is provided in Table 3.1.

To summarise, the traditional state-of-the-art latent fingerprint enhancement algorithms focus on accurate orientation estimation for latent fingerprints and rely only on Gabor filters to enhance them. Recent state-of-the-art techniques, on the other hand, propose learning-based end-to-end latent fingerprint enhancement models which directly generate enhanced fingerprints rather than relying only on Gabor filters; instead, the weights of the CNN kernels are learnt for the problem at hand. However, none of the above-mentioned latent fingerprint enhancement models exploits the perceptual information in the fingerprints.

TABLE 3.1 Table Summarising the Literature on Latent Fingerprint Enhancement

| Algorithm | Proposed Approach | Limitation | Reference |
|---|---|---|---|
| Classical Image Processing and hand-crafted models | Orientation estimation using zero-pole method and distortion model | Requires manually marked ROI and singular points | [6] |
| Classical Image Processing and hand-crafted models | R-RANSAC is used to find top-ten global orientations. All the ten enhanced images are used for matching | Requires manually marked ROI and singular points. Matching with ten enhanced images is an overhead | [8] |
| Dictionary Learning | Dictionary learning-based orientation estimation | Incorrect estimation around singular points, high computation time | [9] |
| Dictionary Learning | Localised dictionary learning-based orientation estimation | Algorithm first performs pose estimation and then orientation estimation leading to high computational complexity | [10] |
| Dictionary Learning | Multi-scale dictionary learning-based orientation estimation | Global multi-scale dictionaries are used due to which local a priori fingerprint information is not utilised | [11] |
| Dictionary Learning | Spectral dictionary | Requires manually marked core points | [14] |
| Dictionary Learning | Sparse coded dictionary learning-based orientation estimation | Global multi-scale sparse coded dictionaries are used due to which local a priori fingerprint information is not utilised | [13] |
| Deep Learning | Convolutional neural network-based classification for orientation estimation | Number of orientation patch classes is very limited, due to which the orientation estimation may not be accurate | [12] |
| Deep Learning | Deep regression neural network for orientation estimation | Requires pre-processing before orientation estimation. Moreover, algorithm is not evaluated on any of the publicly available latent fingerprint databases | [15] |
| Deep Learning | Multi-task learning-based autoencoder | The autoencoder is designed for pre-processed latent fingerprints | [16] |
| Deep Learning | Convolutional autoencoder that minimises orientation and gradient loss | Fails to preserve minutiae in case of poor quality input images | [17] |

Generative adversarial networks (GANs) generate sharper images compared to autoencoders which generate blurred images. As a result, GANs are better suited for generating fingerprint images as they can generate sharp images with clear ridge structure and good ridge-valley contrast. This in turn facilitates improved minutiae extraction and matching performance.

The material on training a GAN for latent fingerprint enhancement presented in this chapter is based on the latent fingerprint enhancement algorithm proposed by Joshi et al. [18]. Their enhancement model is trained not only with a reconstruction loss, which preserves the ridge structure, but also with an adversarial loss: a classification network that classifies the reconstructed image as real or fake limits the generation of spurious patterns. Furthermore, the proposed GAN model is trained on synthetic latent fingerprint images, so the training is not affected by the limited availability of public latent fingerprint images.
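How a reconstruction loss and an adversarial loss combine during training can be sketched as below. The source does not state the exact loss forms at this point, so the mean squared error, the binary cross-entropy, the weight `lam` and all function names are assumptions chosen for illustration, not the formulation of [18].

```python
import math

def mse(pred, target):
    """Mean squared error between two flat lists of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for a single probability and a 0/1 label."""
    prob = min(max(prob, eps), 1 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

def discriminator_loss(d_real, d_fake):
    """The classifier should output 1 for real images and 0 for generated ones."""
    return bce(d_real, 1) + bce(d_fake, 0)

def generator_loss(enhanced, target, d_fake, lam=0.01):
    """Reconstruction term keeps the ridge structure; the adversarial term
    rewards outputs that the classifier accepts as real.
    lam is an assumed trade-off weight."""
    return mse(enhanced, target) + lam * bce(d_fake, 1)

# Toy values: a 4-pixel "image" and the classifier's probability of "real".
target = [0.0, 1.0, 0.0, 1.0]
enhanced = [0.1, 0.8, 0.2, 0.9]
loss_when_detected = generator_loss(enhanced, target, d_fake=0.1)
loss_when_fooled = generator_loss(enhanced, target, d_fake=0.9)
```

For the same reconstruction error, the generator loss is lower when the classifier is fooled, which is the pressure that discourages spurious, unrealistic ridge patterns.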

 