PASHTO OPTICAL CHARACTER RECOGNITION USING NEURAL NETWORK

  • Nasir Ahmad UET Peshawar
  • Mohammad Naeem University of Guelph, Canada
  • Sahibzada Abdur Rehman Abid UET Peshawar
  • Asma Gul SBBWU Peshawar
Keywords: Optical Character Recognition, Pashto OCR, Neural Network, Multilayer Feed Forward Neural Network

Abstract

This paper presents an Optical Character Recognition (OCR) system for printed/scanned continuous Pashto text. In the proposed work, Pashto text is recognized using a Feed Forward Neural Network (FFNN) consisting of an input layer, a hidden layer and an output layer. The input layer is composed of 315 neurons, which receive the pixel data, i.e. binary data from a 21x15 symbol pixel matrix. The hidden layer contains 2000 neurons, a number chosen after testing for the best results, while the output layer is composed of 6 neurons. Because joinable Pashto characters change their size and shape depending on their location in the text, 60 Pashto characters, with 110 samples of each, have been used to train the network.
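
The layer sizes given in the abstract can be illustrated with a minimal forward-pass sketch. Only the dimensions (21x15 input matrix, 315 input, 2000 hidden and 6 output neurons, 60 characters) come from the abstract. The sigmoid activation, the random weight initialisation, the row-by-row flattening, and the reading of the 6 outputs as a binary class code (2^6 = 64 codes, enough for 60 characters) are assumptions made for illustration and are not confirmed by the abstract. Training is not shown.

```python
import math
import random

ROWS, COLS = 21, 15          # symbol pixel matrix size from the abstract
N_INPUT = ROWS * COLS        # 315 input neurons
N_HIDDEN = 2000              # hidden neurons reported in the abstract
N_OUTPUT = 6                 # output neurons reported in the abstract
N_CLASSES = 60               # Pashto characters used for training


def sigmoid(x):
    # Assumed activation; the abstract does not name one.
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (assumed initialisation)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def layer_forward(inputs, weights, biases):
    """One fully connected layer followed by the sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def flatten(glyph):
    """Flatten a 21x15 binary matrix row by row into 315 inputs."""
    return [float(p) for row in glyph for p in row]


def encode_class(index):
    """Assumed target code: class index as 6 bits, most significant first."""
    return [(index >> bit) & 1 for bit in range(N_OUTPUT - 1, -1, -1)]


def decode_output(outputs):
    """Threshold each output at 0.5 and read the bits back as an integer."""
    value = 0
    for o in outputs:
        value = (value << 1) | (1 if o >= 0.5 else 0)
    return value


def forward(glyph, hidden, output):
    """Input layer -> hidden layer -> output layer."""
    h = layer_forward(flatten(glyph), *hidden)
    return layer_forward(h, *output)


rng = random.Random(0)
hidden = init_layer(N_INPUT, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUT, rng)

# A random binary matrix stands in for a segmented, binarised character image.
glyph = [[rng.randint(0, 1) for _ in range(COLS)] for _ in range(ROWS)]
outputs = forward(glyph, hidden, output)
predicted = decode_output(outputs)  # meaningless until the network is trained
```

With untrained weights the decoded value carries no meaning; the sketch only shows how a 21x15 binary symbol maps to 315 inputs and how 6 outputs could address 60 classes. Under the assumed encoding, `decode_output(encode_class(59))` returns `59`.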

Published
2018-02-28
How to Cite
Ahmad, N., Naeem, M., Abid, S., & Gul, A. (2018). PASHTO OPTICAL CHARACTER RECOGNITION USING NEURAL NETWORK. JOURNAL OF ENGINEERING AND APPLIED SCIENCES, 37(1). Retrieved from https://journals.uetjournals.com/index.php/JEAS/article/view/2234