Computer Vision: OCR module

Thursday, 22 December 2011

OCR module

Today I would like to show you my first OCR module. The system is based on HOG and PHOG features, and I used a Random Forest for recognition. I also compared my recognition rate with that of the Tesseract OCR system (used without any preprocessing).

Features:

The HOG feature extraction module works as follows:

Resize the character image to a patch of 32x32 pixels (see the examples below).
Compute the gradients along the X and Y axes, and from them the gradient magnitude and orientation.
Split the magnitude and orientation matrices into 8x8 cells. Compute the HOG of each cell, group the cells into blocks of 2x2 cells, and normalize each block.
Concatenate the HOG features of all blocks.
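The four steps above can be sketched in a few lines of NumPy. This is only an illustration, not the code behind the results in this post. The post does not specify several details, so the sketch assumes them: cells of 8x8 pixels (a 4x4 grid of cells on the 32x32 patch), unsigned orientations in [0, pi), blocks that overlap with a stride of one cell, L2 normalization per block, and a nearest-neighbour resize. All function names are hypothetical.

```python
import numpy as np

def resize_nearest(img, size=32):
    """Nearest-neighbour resize to size x size (stand-in for a real image library)."""
    img = np.asarray(img, dtype=np.float64)
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[rows][:, cols]

def gradient_mag_ori(patch):
    """Gradient magnitude and unsigned orientation in [0, pi)."""
    gy, gx = np.gradient(patch)          # derivatives along rows (Y) and columns (X)
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx) % np.pi     # assumption: unsigned orientation
    return mag, ori

def cell_histograms(mag, ori, cell=8, bins=10):
    """Magnitude-weighted orientation histogram for every cell."""
    n_r, n_c = mag.shape[0] // cell, mag.shape[1] // cell
    idx = np.minimum((ori / np.pi * bins).astype(int), bins - 1)
    hist = np.zeros((n_r, n_c, bins))
    for r in range(n_r):
        for c in range(n_c):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            hist[r, c] = np.bincount(idx[sl].ravel(),
                                     weights=mag[sl].ravel(),
                                     minlength=bins)
    return hist

def hog_descriptor(img, cell=8, block=2, bins=10, eps=1e-6):
    """Resize, compute gradients, build cell histograms, normalize 2x2 blocks, concatenate."""
    patch = resize_nearest(img, 32)
    mag, ori = gradient_mag_ori(patch)
    hist = cell_histograms(mag, ori, cell, bins)
    feats = []
    # Assumption: blocks slide with a stride of one cell, so they overlap.
    for r in range(hist.shape[0] - block + 1):
        for c in range(hist.shape[1] - block + 1):
            v = hist[r:r + block, c:c + block].ravel()
            feats.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))  # L2 normalization
    return np.concatenate(feats)

# Usage on a random stand-in for a character image.
demo_img = np.random.default_rng(0).random((40, 25))
demo_hog = hog_descriptor(demo_img)
print(demo_hog.shape)
```

Under these assumptions a patch yields 3x3 blocks of 4 cells with 10 bins each, that is, a 360-dimensional vector.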

Figure 1) Samples from the ICDAR 2003 data set: easy and hard samples (a, a, p, C, R, S, E, i, E).

The PHOG feature module is similar to the HOG one: it follows the same procedure at every pyramid level, but without grouping cells into blocks.
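A rough sketch of that pyramid variant follows, again as an illustration rather than the code used for the experiments. The post does not give the number of pyramid levels or the normalization, so the sketch assumes three levels (1x1, 2x2 and 4x4 grids), unsigned orientations, and L2 normalization of each cell histogram on its own. The name `phog_descriptor` and its parameters are hypothetical.

```python
import numpy as np

def phog_descriptor(patch, levels=3, bins=10, eps=1e-6):
    """Orientation histograms over a spatial pyramid, without block grouping."""
    patch = np.asarray(patch, dtype=np.float64)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx) % np.pi                      # assumption: unsigned orientation
    idx = np.minimum((ori / np.pi * bins).astype(int), bins - 1)

    feats = []
    for level in range(levels):
        n = 2 ** level                                    # n x n grid at this level
        row_edges = np.linspace(0, patch.shape[0], n + 1).astype(int)
        col_edges = np.linspace(0, patch.shape[1], n + 1).astype(int)
        for r in range(n):
            for c in range(n):
                sl = (slice(row_edges[r], row_edges[r + 1]),
                      slice(col_edges[c], col_edges[c + 1]))
                h = np.bincount(idx[sl].ravel(),
                                weights=mag[sl].ravel(),
                                minlength=bins)
                # Assumption: each cell is normalized on its own (no 2x2 blocks).
                feats.append(h / np.sqrt(np.sum(h ** 2) + eps ** 2))
    return np.concatenate(feats)

# Usage on a random 32x32 stand-in patch.
demo_patch = np.random.default_rng(1).random((32, 32))
demo_phog = phog_descriptor(demo_patch)
print(demo_phog.shape)
```

With three levels there are 1 + 4 + 16 = 21 cells, so 10 bins give a 210-dimensional vector.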

Experiment setup:

The ICDAR 2003 database provides training and testing data sets, with 71 classes across the two sets. I used 100 trees with a depth of 50 for recognition. The figure below shows the accuracy rate versus the number of orientation bins.
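For readers who want to try a similar setup, here is a minimal sketch of the classifier stage using scikit-learn. This is an assumption: the post does not say which Random Forest implementation was used. The depth of 50 is mapped to `max_depth=50`, and the feature vectors are synthetic stand-ins, not ICDAR 2003 data, so the printed accuracy says nothing about the results reported here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for descriptor vectors: 5 classes, 40 samples each, 360 dimensions.
n_classes, n_per_class, n_features = 5, 40, 360
centers = rng.normal(size=(n_classes, n_features))
X = np.repeat(centers, n_per_class, axis=0)
X = X + 0.5 * rng.normal(size=X.shape)
y = np.repeat(np.arange(n_classes), n_per_class)

# Shuffle, then hold out 50 samples for testing.
perm = rng.permutation(len(y))
train_idx, test_idx = perm[:150], perm[150:]

# 100 trees; the depth of 50 is interpreted as a maximum depth (assumption).
clf = RandomForestClassifier(n_estimators=100, max_depth=50, random_state=0)
clf.fit(X[train_idx], y[train_idx])

predictions = clf.predict(X[test_idx])
accuracy = float(np.mean(predictions == y[test_idx]))
print(f"accuracy on synthetic hold-out: {accuracy:.3f}")
```

To reproduce the bins experiment, one would repeat this for each number of orientation bins, recomputing the descriptors each time.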

These accuracy rates raise two points. First, the Tesseract library is not good at recognizing natural scene characters, since it is trained on scanned documents. Second, HOG features seem to encode characters better than PHOG features. Perhaps grouping cells into blocks is more discriminative than the pyramid grouping.

Finally, the best accuracy rates of HOG and PHOG with Random Forest are about 71% and 61% respectively (with orientations quantized into 10 bins), while Tesseract achieves 37.25%.

My best accuracy rate reaches 71%; however, the Stanford University result shows 81.7%.

5 comments:

Unknown, 27 November 2012 at 7:51
I like what you did. But a good question is: did you train Tesseract with the ICDAR data, and if so, how? If you just used Tesseract directly without training, would that be a fair comparison? Thank you in advance.
Ismail