Here, we briefly discuss our implementation of the UNet [83] on two widely used data sets, namely, UBIris v2 [79] and CASIA v4 Interval [1]. We have already discussed the UNet in the sections above; its implementation has been open-sourced by the authors [33].[1] It is a simple encoder-decoder CNN-based architecture in which novel skip connections join encoder layers to decoder layers, providing global context to the already processed local information for the generation of location-precise maps. For each of the data sets, a similar procedure

TABLE 12.5
Testing Results on the Implementation of UNet

Data set                      E1 % Error    E2 % Error
UBIris v2
CASIA-Iris v4 (Interval)      0.67          1.36

as described below was followed. First, the images in the data set were reshaped to 256 × 256 and normalised to [0, 1] by dividing by 255. Next, the data set was split 70%/30% into a training set and a testing set, respectively. We implemented our model using Keras [23] and trained it on an NVIDIA GeForce GTX 1080 Ti for 100 epochs (Table 12.5).
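As a rough illustration of this preprocessing, the normalisation and the 70/30 split can be sketched as follows (the function name, the random seed, and the use of NumPy rather than the original Keras pipeline are our own assumptions):

```python
import numpy as np

def preprocess_and_split(images, test_fraction=0.3, seed=42):
    """Sketch of the described preprocessing: assumes `images` is an
    array of shape (n, 256, 256) already resized; normalises pixel
    values to [0, 1] and splits 70/30 into training and testing sets."""
    images = images.astype(np.float32) / 255.0   # map [0, 255] -> [0, 1]
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(images))           # shuffle before splitting
    n_test = int(len(images) * test_fraction)
    test, train = images[idx[:n_test]], images[idx[n_test:]]
    return train, test

# Usage with a synthetic stand-in for the iris images
data = np.random.randint(0, 256, size=(10, 256, 256), dtype=np.uint8)
train, test = preprocess_and_split(data)
print(train.shape, test.shape)  # (7, 256, 256) (3, 256, 256)
```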

The loss function was binary cross-entropy, and E1 and E2 were taken as the validation metrics.
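Under their usual definitions (as popularised by the NICE.I iris segmentation evaluation), E1 is the proportion of pixels on which the predicted and ground-truth masks disagree, and E2 averages the false-positive and false-negative rates to compensate for the class imbalance between iris and non-iris pixels. A minimal NumPy sketch, assuming binary masks:

```python
import numpy as np

def e1_error(pred, gt):
    """E1: proportion of pixels where the predicted mask and the
    ground-truth mask disagree, averaged over all pixels."""
    return float(np.mean(pred != gt))

def e2_error(pred, gt):
    """E2: average of the false-positive and false-negative rates,
    compensating for iris/non-iris class imbalance."""
    fpr = np.mean(pred[gt == 0] == 1)  # non-iris pixels marked iris
    fnr = np.mean(pred[gt == 1] == 0)  # iris pixels marked non-iris
    return float(0.5 * (fpr + fnr))

# Toy 2x2 example: one false positive and one false negative
gt   = np.array([[1, 1], [0, 0]])
pred = np.array([[1, 0], [1, 0]])
print(e1_error(pred, gt))  # 0.5
print(e2_error(pred, gt))  # 0.5
```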

The kernel size was kept at 3 × 3 with stride 1 for convolution and 3 × 3 with stride 2 for both downsampling and upsampling. Each convolution was followed by a ReLU activation [6] and Batch Normalisation [50] for better generalisation. Different optimisers, such as SGD [89], RMSProp [100], and Adam [56], were tried, and the best results, reported in Table 12.5, were obtained using Adam. All weights were initialised according to the He et al. [38] initialisation. In accordance with the above nomenclature, the loss function is defined as follows:

\mathcal{L}_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[ y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \right]

where N is the number of pixels, y_i \in \{0, 1\} is the ground-truth label of pixel i, and \hat{y}_i is the predicted probability that pixel i belongs to the iris.
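The 3 × 3 kernel and stride arithmetic described above can be illustrated with a naive single-channel NumPy sketch (a toy stand-in for the Keras layers actually used; 'same' padding is assumed): stride 1 preserves the spatial size, while stride 2 halves it for downsampling.

```python
import numpy as np

def conv2d(x, kernel, stride=1):
    """Naive 2-D convolution with 'same'-style zero padding for a
    3x3 kernel and an arbitrary stride (illustration only)."""
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, pad)  # zero padding on all sides
    h_out = (x.shape[0] + 2 * pad - k) // stride + 1
    w_out = (x.shape[1] + 2 * pad - k) // stride + 1
    out = np.zeros((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = xp[i * stride:i * stride + k, j * stride:j * stride + k]
            out[i, j] = np.sum(patch * kernel)
    return out

def relu(z):
    """ReLU activation applied after each convolution."""
    return np.maximum(z, 0)

x = np.ones((256, 256))          # one 256 x 256 input channel
k = np.ones((3, 3)) / 9.0        # a 3 x 3 kernel
print(relu(conv2d(x, k, stride=1)).shape)  # (256, 256): size preserved
print(conv2d(x, k, stride=2).shape)        # (128, 128): halved by stride 2
```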
Before discussing future work, we again describe the various non-idealities present in current data sets, as well as some that may be encountered in future data sets.

  • Occlusion: Eyelids, eyelashes, and hair are the leading causes of occlusion in iris images, with massively varying levels of occlusion, sometimes to the tune of 80%-90%.

  • Blurring: In many cases, the subject is on the move in unconstrained environments, or, even in constrained environments, may move a little, causing the image to blur. Also, in some cases, if the equipment is not correctly set up, the camera itself may move, adding to the blurriness.
  • Alignment: In some cases, the subject may move their eye or even their entire head, which causes the iris to appear more oval-shaped and not occupy the centre of the image as intended, due to misalignment of the face and the equipment. Needless to say, this is highly prevalent in unconstrained environments.
  • Resolution: For proper segmentation and subsequent verification, it is always best to have high-resolution images so that the relevant features are easily extracted and represented. However, this is not always the case, as the resolution of the acquired image is solely dependent on the camera equipment, which may vary from a high-end imaging set-up to a mobile camera.
  • Adulteration: This is an artefact that can appear in many ways. For example, subjects wearing lenses or glasses alter the natural appearance of the iris. Although indistinguishable to the naked eye, these alter the intensity of the original iris at the pixel level, which may substantially hinder segmentation algorithms. In the future, more "artificialness" might be introduced in presently unknown ways.
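During training, some of these non-idealities can be approximated as data augmentation. The sketch below (function names, the band fraction, and the kernel size are illustrative assumptions, not details from the experiments above) simulates an eyelid-style occlusion and a simple blur:

```python
import numpy as np

def occlude(img, frac=0.3):
    """Simulate eyelid-style occlusion (sketch): blanks out a
    horizontal band covering `frac` of the image height."""
    out = img.copy()
    band = int(out.shape[0] * frac)
    out[:band, :] = 0
    return out

def blur(img, k=3):
    """Simulate mild motion/defocus blur with a naive k x k box
    filter (edge padding keeps the output the same size)."""
    pad = k // 2
    xp = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = xp[i:i + k, j:j + k].mean()
    return out

img = np.random.rand(64, 64)          # stand-in for a normalised iris image
print(occlude(img).shape, blur(img).shape)  # (64, 64) (64, 64)
```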

It has been widely discussed that models based on deep learning are heavily dependent on data characteristics. Iris images acquired under unregulated conditions represent nearly all the characteristics of real-world scenarios. Robust and complex deep learning models such as PixISegNet and Iris-DenseNet can now be trained on data sets generated in unconstrained environments and are hence bound to give excellent performance when deployed in the real world. However, there is still much room for development: researchers can improve the robustness of models to look beyond occlusion and identify the features of the anatomy surrounding the iris. A lot of work is yet to be done on GANs, generative models wherein an encoder-decoder may act as the generator and a separate discriminator differentiates between the predicted map and the ground-truth map, which may result in improved performance. Similarly, attention modules, dictionary learning, recurrent neural networks, and many other existing ideas have not been thoroughly explored by researchers. Moreover, most models have been trained with the existing loss functions discussed above, leaving room for improvement there too. Hence, there is still a vast space for developing novel, state-of-the-art techniques that overcome the drawbacks of previous works.

[1] https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/