Recognition of Off-line printed Arabic text Using Hidden Markov Models


Bibliographic Details
Main Author: Al-Muhtaseb, Husni A. (author)
Other Authors: Mahmoud, Sabri A. (author), Qahwaji, Rami S. (author), unknown (author)
Format: article
Published: 2008
Online Access: https://eprints.kfupm.edu.sa/id/eprint/17670/1/Recognition_of_Off-line_printed_Arabic_text_Using_Hidden_Markov_Models_Abstract.pdf
Description
Summary: This paper describes a technique for automatic recognition of off-line printed Arabic text using Hidden Markov Models (HMMs). Overlapping and non-overlapping hierarchical windows of different sizes are used to generate 16 features from each vertical sliding strip. We experimented with eight fonts (Arial, Tahoma, Akhbar, Thuluth, Naskh, Simplified Arabic, Andalus, and Traditional Arabic). The experiments showed that different fonts reach their highest recognition rates at different numbers of states (5 or 7) and codebook sizes (128 or 256). Arabic text is cursive, and each character may have up to four different shapes depending on its location in a word. We treat each shape as a separate class, which gives a total of 126 classes. The average recognition rates (using 126 classes and 16 features for each vertical strip three pixels wide) ranged from 98.08% for Thuluth to 99.89% for Arial. The main contributions of this work are the novel hierarchical sliding window technique and the use of only 16 features per sliding window. Because each shape of an Arabic character is treated as a separate class, the approach bypasses the need to segment Arabic text, and it is applicable to other languages.
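The summary does not specify how the hierarchical windows are laid out, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a binary text-line image (1 for ink), strips three pixels wide, and a hypothetical window layout of 8 non-overlapping cells, 4 pairs, 2 halves, the whole strip, and one overlapping middle window, which happens to give 16 ink-count features per strip. The function names `strip_features` and `sliding_features` are made up for this example.

```python
import numpy as np


def strip_features(strip: np.ndarray) -> np.ndarray:
    """Return 16 ink-count features for one vertical strip (assumed layout)."""
    # Pad the height with blank rows so it divides evenly into 8 cells.
    pad = (-strip.shape[0]) % 8
    if pad:
        strip = np.pad(strip, ((0, pad), (0, 0)))
    rows = strip.sum(axis=1)                  # ink pixels in each row
    cells = rows.reshape(8, -1).sum(axis=1)   # level 1: 8 non-overlapping cells
    pairs = cells.reshape(4, 2).sum(axis=1)   # level 2: 4 windows of 2 cells
    halves = cells.reshape(2, 4).sum(axis=1)  # level 3: 2 windows of 4 cells
    whole = cells.sum(keepdims=True)          # level 4: the whole strip
    middle = cells[2:6].sum(keepdims=True)    # overlapping window across both halves
    return np.concatenate([cells, pairs, halves, whole, middle]).astype(float)


def sliding_features(image: np.ndarray, strip_width: int = 3) -> np.ndarray:
    """Slide a vertical strip across a line image; one 16-feature row per strip."""
    # Pad the width with blank columns so the last strip is full width.
    pad = (-image.shape[1]) % strip_width
    if pad:
        image = np.pad(image, ((0, 0), (0, pad)))
    strips = [
        strip_features(image[:, x:x + strip_width])
        for x in range(0, image.shape[1], strip_width)
    ]
    return np.stack(strips)


if __name__ == "__main__":
    line_image = np.ones((16, 7), dtype=int)  # toy all-ink image
    feats = sliding_features(line_image)
    print(feats.shape)
```

In a pipeline like the one the summary describes, each 16-dimensional row would then be quantized against a codebook (128 or 256 entries are the sizes reported) and the resulting symbol sequence passed to the HMMs. Those stages are omitted here because the record gives no detail on them.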