Proposed Method

The proposed prediction model consists of three stages. The first stage uses SSIM-based template matching to reduce the number of possible prediction labels to 20. These probable character classes are further reduced to ten based on their geometrical features; in this stage, projection profile matching is used to match geometrically similar characters. The final character prediction is performed in the last stage by a machine learning approach, where LFD features and a GRNN are employed to predict the final class. The overall block diagram of the proposed three-stage hybrid model is depicted in Figure 3.7, and each stage of the model is described in Algorithms 3.1 to 3.3.


FIGURE 3.7 Block diagram for the proposed hybrid model.

Algorithm 3.1: Stage I detection algorithm

Data: Template images {I_1, I_2, ..., I_m} and test character image T in preprocessed format
Result: Best 20 probable character classes
for I_k ← {I_1, I_2, ..., I_m} do
    Resize I_k and T to 60 × 60;
    S_k ← SSIM(I_k, T);
end
Return the 20 character classes with the maximum S_k.
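To make Stage I concrete, the following minimal Python sketch shortlists candidate classes with scikit-image's structural_similarity (the SSIM measure of Wang et al. [47]). The dictionary-based template store and the function name stage1_shortlist are illustrative assumptions; the chapter's own implementation was written in MATLAB.

```python
# Sketch of Stage I: SSIM-based template matching to shortlist candidates.
from skimage.transform import resize
from skimage.metrics import structural_similarity


def stage1_shortlist(templates, test_img, top_k=20, size=(60, 60)):
    """Return labels of the top_k template classes most similar to test_img.

    templates: dict mapping class label -> 2-D grayscale template array
               (an assumed data layout, one template per class).
    test_img:  2-D grayscale array of the preprocessed test character.
    """
    test_resized = resize(test_img, size, anti_aliasing=True)
    scores = {}
    for label, tmpl in templates.items():
        tmpl_resized = resize(tmpl, size, anti_aliasing=True)
        # SSIM over the full 60x60 image; data_range is required for floats.
        scores[label] = structural_similarity(
            tmpl_resized, test_resized, data_range=1.0)
    # Keep the top_k classes with the largest SSIM score.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```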

Algorithm 3.2: Stage II detection algorithm

Data: 20 template images {I_1, I_2, ..., I_20} from Stage I and test character image T in preprocessed format
Result: At most 10 probable character classes
[Top_T, Left_T, Bottom_T, Right_T] ← ProjectionCount(T);
for I_k ← {I_1, I_2, ..., I_20} do
    [Top_k, Left_k, Bottom_k, Right_k] ← ProjectionCount(I_k);
    τ_k ← KRCC([Top_T, Left_T, Bottom_T, Right_T], [Top_k, Left_k, Bottom_k, Right_k]);
end
Discard all candidates having zero or negative KRCC;
Return at most the 10 character classes with the maximum τ_k.
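A corresponding Python sketch of Stage II is given below. The chapter does not define ProjectionCount explicitly, so the profile used here (per-row and per-column distance from each image edge to the first foreground pixel, a common projection-profile feature) is an assumption; the KRCC is computed with scipy.stats.kendalltau. Both images are assumed to be the 60 × 60 versions from Stage I so that the profiles have equal length.

```python
# Sketch of Stage II: projection-profile matching ranked by Kendall's rank
# correlation coefficient (KRCC).
import numpy as np
from scipy.stats import kendalltau


def first_hit(fg):
    """Per-column distance from the top edge to the first foreground pixel."""
    hit = fg.any(axis=0)
    dist = np.argmax(fg, axis=0)
    dist[~hit] = fg.shape[0]        # empty columns get the maximum distance
    return dist


def projection_count(img):
    """Concatenate top/left/bottom/right profiles of a binary character image."""
    fg = img > 0
    top = first_hit(fg)             # scan each column from the top
    bottom = first_hit(fg[::-1, :]) # scan each column from the bottom
    left = first_hit(fg.T)          # scan each row from the left
    right = first_hit(fg[:, ::-1].T)  # scan each row from the right
    return np.concatenate([top, left, bottom, right])


def stage2_rank(candidates, test_img, top_k=10):
    """Keep at most top_k candidates with positive KRCC to the test image."""
    test_profile = projection_count(test_img)
    taus = {}
    for label, tmpl in candidates.items():
        tau, _ = kendalltau(test_profile, projection_count(tmpl))
        if tau > 0:                 # discard zero or negative correlations
            taus[label] = tau
    return sorted(taus, key=taus.get, reverse=True)[:top_k]
```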

Algorithm 3.3: Stage III detection algorithm

Data: Feature sets {(DI_1, Y_1), (DI_2, Y_2), ..., (DI_10, Y_10)} for the 10 classes from Stage II and test character image T in preprocessed format, where DI_i = {DI_i^1, DI_i^2, ..., DI_i^p}
Initialization: σ = √2
Result: Final character label prediction Y
DI_T ← LFD(T);
for i ← 1, 2, ..., 10 do
    for each sample j in DI_i do
        D_ij ← Σ_k (DI_T,k − DI_i,k^j)²;
        Y_ij ← onehotvector(i, 10);
    end
end
Y ← (Σ_i Σ_j Y_ij exp(−D_ij / 2σ²)) / (Σ_i Σ_j exp(−D_ij / 2σ²));
Return Y;
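Stage III reduces to the one-pass GRNN estimator of Specht [45] applied to LFD feature vectors. The sketch below implements only the scoring of Algorithm 3.3; the LFD extractor itself is assumed to exist, and the list-of-arrays data layout is an illustrative assumption.

```python
# Sketch of Stage III: GRNN prediction over LFD feature vectors.
import numpy as np


def grnn_predict(class_features, test_feature, sigma=np.sqrt(2)):
    """Predict a class by kernel-weighted averaging of one-hot label vectors.

    class_features: list where entry i is an (n_i, p) array of LFD feature
                    vectors for candidate class i (10 classes from Stage II).
    test_feature:   (p,) LFD feature vector of the test character.
    """
    n_classes = len(class_features)
    numer = np.zeros(n_classes)
    denom = 0.0
    for i, samples in enumerate(class_features):
        # Squared Euclidean distance from the test vector to every sample.
        d2 = ((samples - test_feature) ** 2).sum(axis=1)
        w = np.exp(-d2 / (2.0 * sigma ** 2))
        one_hot = np.eye(n_classes)[i]
        numer += w.sum() * one_hot   # each sample votes for its own class
        denom += w.sum()
    y = numer / denom                # soft class-membership vector
    return int(np.argmax(y))        # index into the candidate class list
```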

Experiments

Dataset Creation

From earlier studies, we observed that most research on Odia character recognition is based on the 47-51 basic characters. However, in Odia, each basic character can be combined with any other to produce a compound character, which in principle makes the number of possible classes combinatorially large. Not all of these combinations are used in Odia literature, however. From our study, we found that at most 432 characters are actually in use.


FIGURE 3.8 Sample images from the dataset.

Of these, 211 are used most of the time and the rest only rarely. Without recognition of these 211 characters, an Odia OCR is not effective for real-world use. Moreover, the compound characters do not follow any specific rule of combination, which makes the recognition process more difficult. Since no publicly available dataset covers the compound characters, we developed a dataset by collecting samples for each identified class. The dataset covers font and size variation along with heavy-print, thinning, and rotation errors. Each class contains 250 samples, giving a total of 52,750 samples (211 × 250). Some example images from the dataset are illustrated in Figure 3.8.

Experimental Setup

For our experiments, we used a ten-fold cross-validation strategy to avoid overfitting. Each of the methods and hybrid models was implemented in Matlab 2015b on a Windows machine. The following hyper-parameters were used for the proposed model. The first stage uses SSIM for template matching, whose parameters were kept fixed. The second stage uses KRCC ranking, which requires no hyper-parameters. The third stage uses the GRNN model, a one-pass learning approach that requires a smoothing factor to predict the class label of an unknown sample; here, the smoothing factor was set to √2. We verified the effectiveness of the single-stage feature-based recognition models on both the 47-class and the 211-class character sets. The existing methods achieved higher accuracy on the 47 classes but comparatively lower accuracy on the 211 classes. The comparative results are shown in Table 3.1. Hence, a multi-stage hybrid model is proposed to improve recognition accuracy on both problems. We explored many hybrid models and found that the combination of SSIM matching, projection profile ranking with KRCC, and GRNN with LFD features yields the best accuracy. SSIM matching reduces the number of probable characters to 20 based on overall structural similarity; the projection-profile-based ranking then reduces the candidates to ten; finally, LFD features and the GRNN classifier predict the final character label.
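As an illustration of the evaluation protocol, a ten-fold cross-validation loop might look as follows in Python (the experiments themselves were run in Matlab 2015b; predict_hybrid is a hypothetical stand-in for the full three-stage model):

```python
# Illustrative stratified ten-fold cross-validation of a character recognizer.
import numpy as np
from sklearn.model_selection import StratifiedKFold


def cross_validate(images, labels, predict_hybrid, n_splits=10):
    """Mean accuracy of the hybrid predictor over stratified ten-fold CV.

    predict_hybrid(train_images, train_labels, test_image) -> predicted label
    is an assumed interface wrapping Stages I-III.
    """
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    accs = []
    for train_idx, test_idx in skf.split(images, labels):
        correct = sum(
            predict_hybrid(images[train_idx], labels[train_idx], images[t])
            == labels[t]
            for t in test_idx)
        accs.append(correct / len(test_idx))
    return float(np.mean(accs))
```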

TABLE 3.1

Comparison with the State-of-the-Art Methods on 211 Classes

| Reference | Feature | Classifier | Accuracy for Odia Basic Characters (47 Classes) (%) | Accuracy for Odia Basic and Compound Characters (211 Classes) (%) |
|---|---|---|---|---|
| [15] | Zone feature | SVM | 70.14 | 46.88 |
| | | KNN (k = 5) | 91.78 | 67.27 |
| | | MLP | 82.33 | 69.99 |
| [24] | Quadrant feature | SVM | 20.70 | 10.30 |
| | | KNN (k = 5) | 96.50 | 65.25 |
| | | MLP | 83.22 | 33.50 |
| [38] | Zone mean feature (16 zones) | SVM | 49.70 | 32.33 |
| | | KNN (k = 5) | 98.63 | 61.62 |
| | | MLP | 82.80 | 68.22 |
| [35] | Directional feature | SVM | 98.14 | 89.21 |
| | | KNN (k = 5) | 97.91 | 88.83 |
| | | MLP | 96.14 | 89.10 |
| [40] | LU feature | SVM | 50.21 | 18.57 |
| | | KNN (k = 5) | 81.34 | 10.71 |
| | | MLP | 47.20 | 14.73 |
| [26] | DCT | SVM | 97.31 | 85.49 |
| | | KNN (k = 5) | 97.71 | 87.81 |
| | | MLP | 95.12 | 87.85 |
| [48] | DCST + PCA | SVM | 96.26 | 82.52 |
| | | KNN (k = 5) | 96.70 | 89.50 |
| | | MLP | 95.60 | 83.66 |
| [13] | DOST + PCA | SVM | 93.40 | 80.68 |
| | | KNN (k = 5) | 95.05 | 86.90 |
| | | MLP | 97.75 | 81.83 |
| Proposed Model | | | 97.95 | 90.60 |

The comparison results for the different possible hybrid models are given in Table 3.2.

Results and Discussion

From Table 3.1, it can be observed that the existing models obtain higher accuracy for 47 classes; however, they perform poorly on the 211 character classes. A hybrid model helps to increase recognition accuracy by using multiple stages. In this study, we proposed two different types of hybrid models based on the number of stages required: a two-stage hybrid model and a three-stage hybrid model. In the two-stage feature-based model, the recognition results of the first stage are further refined using LFD and GRNN. Here, different feature descriptors such as DCST and DOST, along with an SVM classifier, were taken into consideration.

TABLE 3.2

Recognition Performance of Different Possible Hybrid Models

| Hybrid Model | Accuracy (%) |
|---|---|
| Two-stage hybrid classification | |
| Stage I: DCST + PCA and SVM; Stage II: LFD and GRNN | 75.38 |
| Stage I: DCST + PCA and SVM; Stage II: LFD and SVM | 78.73 |
| Stage I: DOST + PCA and SVM; Stage II: LFD and SVM | 80.45 |
| Stage I: DOST + PCA and SVM; Stage II: LFD and GRNN | 83.64 |
| Three-stage hybrid classification | |
| Stage I: SSIM matching; Stage II: Projection profile matching (KRCC); Stage III: LFD and GRNN | 90.60 |
| Stage I: SSIM matching; Stage II: Projection profile matching (KRCC); Stage III: LFD and SVM | 80.80 |

However, as Table 3.2 shows, these two-stage models achieve reduced accuracy compared to the best single-stage models. This is because stage I produces similar probability values for characters with the same geometric structure. Hence, we designed a three-stage hybrid model to overcome the limitations of the two-stage model. Here, template matching and ranking methods reduce the number of candidate character classes: SSIM matching and projection profile ranking discard the highly improbable characters before the final class is predicted using LFD and GRNN/SVM. The results of the hybrid models are shown in Table 3.2. The three-stage model with LFD and GRNN provides the highest recognition performance (90.6%) in comparison to the other hybrid models and the state-of-the-art methods.

The proposed model is applied to form an OCR for Odia characters. Figures 3.9-3.14 show the results for a sample input and output at different stages of the proposed model.


FIGURE 3.10 Result of line separation from binary image.


FIGURE 3.11 Result of word separation from each line of image.


FIGURE 3.12 Result of separation of compound and basic characters from each word.


FIGURE 3.13 Result of separation of special symbol by connected component analysis.


Figure 3.9(a) shows the original input to the OCR application with a background, whereas Figure 3.9(b) depicts the output of the binarization step. We used Tsallis entropy and a differential evolution optimization technique to separate the background from the foreground text. In the next stage, the lines are separated by line-wise histogram analysis; the output of the line separation is given in Figure 3.10. For each line, column-wise histogram analysis is performed and a threshold distance for spaces is evaluated. Based on this, we separated the words from each line; the output is illustrated in Figure 3.11. We used connected component analysis to separate characters from words, and the corresponding output is shown in Figures 3.12 and 3.13. Each separated character is recognized by the classification model, and the corresponding Unicode is used to generate the text file. The final output of the application is given in Figure 3.14, with recognition errors highlighted in red.
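The segmentation steps described above can be sketched as follows in Python, assuming an already binarized page; the run-detection helper, the space threshold min_space, and the function names are illustrative assumptions rather than the chapter's implementation.

```python
# Sketch of page segmentation: row-histogram line separation, column-histogram
# word separation with a space threshold, and connected-component characters.
import numpy as np
from scipy import ndimage


def runs(mask):
    """Start/end indices of consecutive True runs in a 1-D boolean mask."""
    padded = np.concatenate([[False], mask, [False]])
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[::2], edges[1::2]))


def split_words(line, min_space):
    """Group column runs separated by gaps wider than min_space into words."""
    col_runs = runs(line.sum(axis=0) > 0)
    if not col_runs:
        return []
    groups, current = [], [col_runs[0]]
    for prev, nxt in zip(col_runs, col_runs[1:]):
        if nxt[0] - prev[1] > min_space:   # gap wider than the space threshold
            groups.append(current)
            current = []
        current.append(nxt)
    groups.append(current)
    return [line[:, g[0][0]:g[-1][1]] for g in groups]


def segment_page(binary_page, min_space=5):
    """Yield per-character binary images from a binarized page (foreground > 0)."""
    for r0, r1 in runs(binary_page.sum(axis=1) > 0):   # line-wise histogram
        line = binary_page[r0:r1]
        for word in split_words(line, min_space):      # column-wise histogram
            labeled, n = ndimage.label(word > 0)       # connected components
            for k in range(1, n + 1):
                yield (labeled == k).astype(np.uint8)
```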

In this study, we used 211 character classes to validate the model, in contrast to previous models that considered only 47-51 characters. The results show how the existing models fail on the 211 classes and how our model improves recognition accuracy. The usability of the model was demonstrated through an application example, which can be used to preserve historical documents in digital form and to convert printed documents into an editable text format.

Conclusion and Future Scope

In this chapter, we proposed a hybrid three-stage model for the recognition of Odia compound and basic characters. The model first uses the structural similarity index with template matching to find the 20 most similar characters for a given test sample. A projection matching technique is then applied to retain the most similar among these. Finally, the actual class is predicted using LFD features and GRNN. We validated the proposed model on a dataset of 52,750 images spanning 211 classes. The proposed scheme obtained a higher accuracy of 90.6% when compared with the state-of-the-art methods and can hence be used to better recognize historical and contemporary Odia documents. The achieved accuracy is still far from human-like precision; hence, further research is needed to improve model performance. Separating touching characters and overlapping words could enable more accurate recognition. Deep learning algorithms have recently achieved dramatic success in a variety of computer vision applications; hence, the performance on the considered dataset could also be tested using different contemporary deep learning algorithms.

References

1. Alginahi, Y.: Preprocessing Techniques in Character Recognition. INTECH Open Access Publisher (2010).
2. Antani, S., Agnihotri, L.: Gujarati character recognition. In: Proceedings of the Fifth International Conference on Document Analysis and Recognition, pp. 418-421. Los Alamitos, CA: IEEE (1999).
3. Ashwin, T., Sastry, P.: A font and size-independent OCR system for printed Kannada documents using support vector machines. Sadhana 27(1), 35-58 (2002).
4. Basa, D., Meher, S.: Handwritten Odia character recognition. Recent Advances in Microwave Tubes, Devices and Communication, Jaipur, India (2011).
5. Bhowmik, T.K., Parui, S.K., Bhattacharya, U., Shaw, B.: An HMM based recognition scheme for handwritten Oriya numerals. In: International Conference on Information Technology, pp. 105-110. Bhubaneswar, India: IEEE (2006).
6. Chaudhuri, B., Pal, U.: A complete printed Bangla OCR system. Pattern Recognition 31(5), 531-549 (1998).
7. Chaudhuri, B., Pal, U., Mitra, M.: Automatic recognition of printed Oriya script. Sadhana 27(1), 23-34 (2002).
8. Cheriet, M., Kharma, N., Liu, C.L., Suen, C.: Character Recognition Systems: A Guide for Students and Practitioners. Hoboken, NJ: John Wiley & Sons (2007).
9. Chinnuswamy, P., Krishnamoorthy, S.G.: Recognition of handprinted Tamil characters. Pattern Recognition 12(3), 141-152 (1980).
10. Das, D., Nayak, D.R., Dash, R., Majhi, B.: An empirical evaluation of extreme learning machine: Application to handwritten character recognition. Multimedia Tools and Applications 78(14), 19495-19523 (2019).
11. Das, D., Nayak, D.R., Dash, R., Majhi, B., Zhang, Y.D.: H-WordNet: A holistic convolutional neural network approach for handwritten word recognition. IET Image Processing (March 2020). https://digital-library.theiet.org/content/journals/10.1049/iet-ipr.2019.1398.
12. Dash, K.S., Puhan, N., Panda, G.: A hybrid feature and discriminant classifier for high accuracy handwritten Odia numeral recognition. In: IEEE Region 10 Symposium, Kuala Lumpur, Malaysia, pp. 531-535. IEEE (2014).
13. Dash, K.S., Puhan, N., Panda, G.: Non-redundant Stockwell transform based feature extraction for handwritten digit recognition. In: International Conference on Signal Processing and Communications, Bangkok, Thailand, pp. 1-4. IEEE (2014).
14. Dash, K.S., Puhan, N., Panda, G.: On extraction of features for handwritten Odia numeral recognition in transformed domain. In: Eighth International Conference on Advances in Pattern Recognition, Kolkata, India, pp. 1-6. IEEE (2015).
15. Dash, K., Puhan, N., Panda, G.: BESAC: Binary external symmetry axis constellation for unconstrained handwritten character recognition. Pattern Recognition Letters 83, 413-422 (2016).
16. Dholakia, J., Yajnik, A., Negi, A.: Wavelet feature based confusion character sets for Gujarati script. In: International Conference on Computational Intelligence and Multimedia Applications, vol. 2, pp. 366-370. IEEE (2007).
17. Garain, U., Chaudhuri, B.: Compound character recognition by run-number-based metric distance. In: Photonics West '98 Electronic Imaging, San Jose, CA, pp. 90-97. International Society for Optics and Photonics (1998).
18. Khurshid, K., Siddiqi, I., Faure, C., Vincent, N.: Comparison of Niblack inspired binarization methods for ancient documents. In: IS&T/SPIE Electronic Imaging, San Jose, CA, p. 72470U. International Society for Optics and Photonics (2009).
19. Kittler, J., Illingworth, J.: Minimum error thresholding. Pattern Recognition 19(1), 41-47 (1986).
20. Kumar, B., Kumar, N., Palai, C., Jena, P.K., Chattopadhyay, S.: Optical character recognition using ant miner algorithm: A case study on Oriya character recognition. International Journal of Computer Applications 57(7), 33-41 (2012).
21. Lehal, G., Singh, C.: Feature extraction and classification for OCR of Gurmukhi script. VIVEK-BOMBAY 12(2), 2-12 (1999).
22. Maani, R., Kalra, S., Yang, Y.H.: Rotation invariant local frequency descriptors for texture classification. IEEE Transactions on Image Processing 22(6), 2409-2419 (2013).
23. Maani, R., Kalra, S., Yang, Y.H.: Robust volumetric texture classification of magnetic resonance images of the brain using local frequency descriptor. IEEE Transactions on Image Processing 23(10), 4625-4636 (2014).
24. Mahto, M.K., Kumari, A., Panigrahi, S.: A system for Oriya handwritten numeral recognization for Indian postal automation. International Journal of Applied Science & Technology Research Excellence 1(1), 17-23 (2011).
25. Meher, S., Basa, D.: An intelligent scanner with handwritten Odia character recognition capability. In: 2011 Fifth International Conference on Sensing Technology (ICST), Palmerston North, New Zealand, pp. 53-59. IEEE (2011).
26. Mishra, T.K., Majhi, B., Panda, S.: A comparative analysis of image transformations for handwritten Odia numeral recognition. In: International Conference on Advances in Computing, Communications and Informatics, Chennai, India, pp. 790-793. IEEE (2013).
27. Mishra, T.K., Majhi, B., Dash, R.: Shape descriptors-based generalised scheme for handwritten character recognition. International Journal of Computational Vision and Robotics 6(1-2), 168-179 (2016).
28. Mishra, T.K., Majhi, B., Sa, P.K., Panda, S.: Model based Odia numeral recognition using fuzzy aggregated features. Frontiers of Computer Science 8(6), 916-922 (2014).
29. Mohanty, S.: Pattern recognition in alphabets of Oriya language using Kohonen neural network. International Journal of Pattern Recognition and Artificial Intelligence 12(07), 1007-1015 (1998).
30. Mori, S., Suen, C.Y., Yamamoto, K.: Historical review of OCR research and development. In: Document Image Analysis, pp. 244-273. IEEE Computer Society Press.
31. Mousa, M.A., Sayed, M.S., Abdalla, M.I.: An efficient algorithm for Arabic optical font recognition using scale-invariant detector. International Journal on Document Analysis and Recognition (IJDAR) 18(3), 263-270 (2015).
32. Padhi, D.: Novel hybrid approach for Odia handwritten character recognition system. International Journal of Advanced Research in Computer Science and Software Engineering 2(5), 150-157 (2012).
33. Padhi, D., Senapati, D.: Zone centroid distance and standard deviation based feature matrix for Odia handwritten character recognition. In: Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA), Odisha, India, pp. 649-658. Springer (2013).
34. Pal, U., Chaudhuri, B.: Indian script character recognition: A survey. Pattern Recognition 37(9), 1887-1899 (2004).
35. Pal, U., Sharma, N., Wakabayashi, T., Kimura, F.: Handwritten numeral recognition of six popular Indian scripts. In: Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Curitiba, Brazil, vol. 2, pp. 749-753 (September 2007).
36. Pal, U., Wakabayashi, T., Kimura, F.: A system for off-line Oriya handwritten character recognition using curvature feature. In: 10th International Conference on Information Technology, Beijing, China, pp. 227-229. IEEE (2007).
37. Pujari, P., Majhi, B.: A survey on Odia character recognition. International Journal of Emerging Science and Engineering 3(4), 15-25 (2015).
38. Roy, K., Pal, T., Pal, U., Kimura, F.: Oriya handwritten numeral recognition system. In: Eighth International Conference on Document Analysis and Recognition, Seoul, Korea, pp. 770-774. IEEE (2005).
39. Sarangi, P.K., Ahmed, P.: Recognition of handwritten Odia numerals using artificial intelligence techniques. International Journal of Computer Science 2(02), 35-38 (2013).
40. Sarangi, P.K., Ahmed, P., Ravulakollu, K.K.: Naive Bayes classifier with LU factorization for recognition of handwritten Odia numerals. Indian Journal of Science and Technology 7(1), 35-38 (2014).
41. Sarangi, P.K., Sahoo, A.K., Ahmed, P.: Recognition of isolated handwritten Oriya numerals using Hopfield neural network. International Journal of Computer Applications 40(8), 36-42 (2012).
42. Sarkar, S., Das, S., Paul, S., Polley, S., Burman, R., Chaudhuri, S.S.: Multi-level image segmentation based on fuzzy-Tsallis entropy and differential evolution. In: IEEE International Conference on Fuzzy Systems (FUZZ), Hyderabad, India, pp. 1-8. IEEE (2013).
43. Sauvola, J., Pietikäinen, M.: Adaptive document image binarization. Pattern Recognition 33(2), 225-236 (2000).
44. Sekita, I., Kurita, T., Otsu, N., Abdelmalek, N.: A thresholding method using the mixture of normal density functions. International Symposium on Speech, Image Processing and Neural Networks 1, 304-307 (1994).
45. Specht, D.F.: A general regression neural network. IEEE Transactions on Neural Networks 2(6), 568-576 (1991).
46. Sukhaswami, M., Seetharamulu, P., Pujari, A.K.: Recognition of Telugu characters using neural networks. International Journal of Neural Systems 6(03), 317-357 (1995).
47. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing 13(4), 600-612 (2004).
48. Mohapatra, R.K., Majhi, B., Jena, S.K.: Classification performance analysis of MNIST dataset utilizing a multi-resolution technique. In: 2015 International Conference on Computing, Communication and Security (ICCCS), Pamplemousses, pp. 1-5 (2015).
 