7 April 2019

Available online at www.sciencedirect.com

ScienceDirect

AASRI Procedia 4 (2013) 306 – 312

2013 AASRI Conference on Intelligent Systems and Control

Off-Line Handwritten Character Recognition using Features Extracted from Binarization Technique

Amit Choudhary^a,*, Rahul Rishi^b, Savita Ahlawat^c

^a Maharaja Surajmal Institute, New Delhi, India

^b UIET, Maharshi Dayanand University, Rohtak, India

^c Maharaja Surajmal Institute of Technology, New Delhi, India

Abstract

The choice of pattern classifier and the technique used to extract features are the main factors that determine the recognition accuracy and capability of an Optical Character Recognition (OCR) system. The main focus of this work is to extract features obtained by a binarization technique for the recognition of handwritten characters of the English language. The handwritten character images are recognized using a multi-layered feed forward artificial neural network as the classifier. Preprocessing techniques such as thinning, foreground and background noise removal, cropping and size normalization are also applied to the character images before classification. Very promising results are achieved when binarization features and the multilayer feed forward neural network classifier are used to recognize off-line cursive handwritten characters.

Keywords: OCR; Binarization; Feature Extraction; Character Recognition; Backpropagation Algorithm; Neural Network.

1. Introduction

The significance of a piece of paper in supporting people's memory cannot be overlooked. It is used for both private correspondence (letters, notes, addresses, reminders, lists, diaries etc.) and official correspondence (bank cheques, tax forms, admission forms etc.). Paper is important in our daily life because it is cheap, reliable, easily available, flexible to fill in, secure for future reference and easy to keep. A huge amount of important historical data is also written on paper. There is therefore a great demand to digitize all these paper documents so that people all over the world can access these important sources of knowledge. For this purpose, the image of handwritten text is preprocessed and segmented into individual characters, which are then recognized by a neural network classifier.

The process of reading handwritten text from static surfaces is termed off-line cursive handwriting recognition. Simulating the behaviour of the human brain in a machine (for the task of reading handwritten or printed text) has opened innovative prospects for improving the man-machine interface. For the last four decades, the classification of cursive and unconstrained handwritten characters has been a major issue in this field of research.

2. Related work

Off-line character recognition is an active area of research. Compared to machine-printed character recognition, the work done in the area of handwritten character recognition is very limited, as mentioned by Apurva A. Desai [1]. In 2002, Kundu & Chen [2] used an HMM to recognize 100 postal words and reported 88.2% recognition accuracy. In 2007, Tomoyuki et al. [3] used 1646 city names of European countries in a recognition experiment and achieved an accuracy of 80.2%. In 2006, Gatos et al. [4] used a K-NN classifier to recognize 3799 words from the IAM database and reported 81% accuracy.

3. Handwritten Character Database Preparation

The handwritten character images are captured with a digital camera; they can also be scanned with a scanner. This process is known as image acquisition [5]. All the handwritten character images are converted to a uniform image format such as .bmp or .jpg so that they are ready for the next processing step. The characters may be written on a pure white background or on a colored (noisy) background, and with different pens of various ink colors. Character image samples were collected from 10 different people (aged 15-50 years), each of whom wrote 5 samples of the complete English alphabet (a-z). In this way, 1300 (10 × 5 × 26 = 1300) character image samples were collected for the proposed experiment.
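The size of the collected database can be checked with a short Python sketch that enumerates one record per expected character image. The counts (10 writers, 5 samples each, 26 letters) come from the text; the record layout `(writer, sample, letter)` and the function name `build_sample_index` are assumptions made only for this illustration and are not taken from the paper.

```python
import string
from itertools import product

NUM_WRITERS = 10          # contributors, as stated in the text
SAMPLES_PER_WRITER = 5    # alphabets written by each contributor
LETTERS = string.ascii_lowercase  # 'a'..'z', 26 character classes


def build_sample_index():
    """Return one (writer, sample, letter) record per expected image.

    The record layout is hypothetical; the paper does not describe
    how the image files are named or indexed.
    """
    return list(
        product(
            range(1, NUM_WRITERS + 1),
            range(1, SAMPLES_PER_WRITER + 1),
            LETTERS,
        )
    )


index = build_sample_index()
print(len(index))  # 10 * 5 * 26 = 1300
```

Each letter class therefore has 50 samples (10 writers × 5 samples), which is the per-class count available for training and testing.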

4. Preprocessing

Preprocessing is done to remove the variability that is present in off-line handwritten characters.

4.1 Grayscale conversion

In this phase of preprocessing, the input image of a handwritten character in .bmp format from the local database, shown in Fig 1(a), is converted to grayscale using the “rgb2gray” function of MATLAB. The resulting handwritten character image is shown in Fig 1(b).
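The conversion can be illustrated with a minimal Python sketch. The paper itself uses MATLAB's rgb2gray; the weights below (0.2989, 0.5870, 0.1140) are the luminance coefficients that function applies to the red, green and blue channels. The function name `rgb_to_gray` and the representation of an image as nested lists of (R, G, B) tuples are assumptions made for this example only.

```python
def rgb_to_gray(image):
    """Convert an RGB image to grayscale.

    `image` is a list of rows, each row a list of (R, G, B) tuples
    with values in 0..255 (a hypothetical representation chosen for
    this sketch). Returns rows of gray levels in 0..255.
    """
    return [
        [int(round(0.2989 * r + 0.5870 * g + 0.1140 * b)) for (r, g, b) in row]
        for row in image
    ]


# A 1 x 3 image: one white, one black and one pure red pixel.
sample = [[(255, 255, 255), (0, 0, 0), (255, 0, 0)]]
print(rgb_to_gray(sample))
```

Green contributes most to the gray level and blue least, so ink colors with the same nominal brightness can still end up at different gray levels.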

4.2 Binarization

Binarization is an important image processing step in which the pixel values are separated into two groups: white as background and black as foreground. Only two colors, white and black, can be present in a binary image. The goal of binarization is to minimize the unwanted information present in the image while protecting the useful information. It must preserve as much of the useful information and detail in the image as possible while efficiently eliminating the background noise.
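The separation into two groups can be sketched in Python as a global threshold on the grayscale image. This is an illustration only: the visible text does not state how the threshold is chosen, so the fixed default of 128, the function name `binarize` and the nested-list image representation are all assumptions. The sketch encodes black (foreground) as 0 and white (background) as 1, and treats darker pixels as text.

```python
def binarize(gray, threshold=128):
    """Turn a grayscale image into a binary image.

    `gray` is a list of rows of gray levels in 0..255. Pixels darker
    than `threshold` become 0 (black, foreground); all others become
    1 (white, background). The threshold value is a hypothetical
    default, not a parameter reported in the paper.
    """
    return [[0 if value < threshold else 1 for value in row] for row in gray]


# Dark stroke pixels (10, 127) become foreground; bright ones background.
sample = [[10, 200], [128, 127]]
print(binarize(sample))
```

A fixed threshold is the simplest choice; on noisy or unevenly lit backgrounds a threshold computed from the image histogram would usually be preferred.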

It is assumed that the intensity of the text is less than that of backgr
