#1 Posted : Monday, February 25, 2008 12:44:16 PM(UTC)

Gutek  
Gutek

Groups: Registered
Posts: 5


Hello all.
I have just started exploring the LEADTOOLS OCR utility, so far with the OCR_Util application. I have some observations about it:
1. I have noticed that it is impossible to recognize light text on a dark background. For instance, inverting the colors in one of the example OCR images makes the text unrecognizable. Can this be alleviated by some parameter settings? If not, what can be done about it?
2. Sometimes, when loading an image, there is an error: "Can't add page to engine, Error = -1239", which according to the documentation means "Non-supported resolution". The image was a bitmap, 400 x 300 pixels, with a single sentence in black on a white background, font size about 24 pt. Curiously, when the .bmp file was converted to .jpg, it loaded without any problem.
What are the possible reasons for such an error?

Best regards,
Gutek
 


#2 Posted : Tuesday, February 26, 2008 4:18:24 AM(UTC)

Qasem Lubani  
Guest

Groups: Guests
Posts: 3,022

Was thanked: 2 time(s) in 2 post(s)

You are correct. Our OCR engine can only read dark characters on a light background.

About the second issue, please send me the image in a ZIP or RAR file and I will test it for you. You can either post the files here or send them to me at [email protected]
 
#3 Posted : Tuesday, February 26, 2008 7:37:31 AM(UTC)

Gutek  
Gutek

Groups: Registered
Posts: 5


Hello,
Thanks for the reply. I am attaching the bitmap.

Concerning the background issue, what can be done if we do not know a priori what the contrast between the text and the background will be? For instance, we can expect yellow text on a blue background or some other combination (a dark background is possible): easy to distinguish for the eye, but more difficult for an OCR engine.
File Attachment(s):
bialy43.zip (5kb) downloaded 26 time(s).
 
#4 Posted : Wednesday, February 27, 2008 5:16:26 AM(UTC)

Qasem Lubani  
Guest

Groups: Guests
Posts: 3,022

Was thanked: 2 time(s) in 2 post(s)


First of all, I'm sorry I gave you an incorrect answer.


I tested some more and found that the engine can actually recognize text in inverted regions. I'm attaching a sample TIFF that I tested with, and it worked. If you have sample images that do not behave the same way, send them over and we will check them for you.


About the image you sent: the resolution stored in it is 72 DPI, which is very low. I changed it and saved the image back, and it now seems to work. I'm also attaching the modified image.

File Attachment(s):
bialy43.zip (322kb) downloaded 31 time(s).
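
For readers who want to inspect or change the stored resolution without an image editor, here is a minimal Python sketch. It is not LEADTOOLS code. It assumes an uncompressed BMP with the common 40-byte (or larger) info header, where the horizontal and vertical resolution are stored as pixels per meter at byte offsets 38 and 42. The function names are made up for illustration.

```python
import struct

INCHES_PER_METER = 39.3701  # BMP stores resolution in pixels per meter


def read_bmp_dpi(data: bytes) -> tuple[int, int]:
    """Return (x_dpi, y_dpi) stored in a BMP info header."""
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    header_size = struct.unpack_from("<I", data, 14)[0]
    if header_size < 40:
        raise ValueError("this header variant has no resolution fields")
    # biXPelsPerMeter and biYPelsPerMeter, two signed 32-bit integers
    x_ppm, y_ppm = struct.unpack_from("<ii", data, 38)
    return round(x_ppm / INCHES_PER_METER), round(y_ppm / INCHES_PER_METER)


def set_bmp_dpi(data: bytes, dpi: int) -> bytes:
    """Return a copy of the BMP with both resolution fields set to dpi.

    Only the metadata changes; the pixel data is left untouched.
    """
    read_bmp_dpi(data)  # reuse the validation above
    ppm = round(dpi * INCHES_PER_METER)
    out = bytearray(data)
    struct.pack_into("<ii", out, 38, ppm, ppm)
    return bytes(out)
```

Typical use would be to read the file with open("input.bmp", "rb"), pass the bytes to set_bmp_dpi with the desired value, and write the result to a new file (the file name here is only a placeholder).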
 
#5 Posted : Saturday, March 1, 2008 9:56:59 AM(UTC)

Gutek  
Gutek

Groups: Registered
Posts: 5


Hi again,
The image with the new resolution can be recognized. However, I wonder what difference the resolution makes if we are working with digital data. The old image (72 DPI) and the new one (150 DPI) are in fact the same image data. I believe the DPI value matters only for printing; are there any reasons to take it into consideration in OCR?

As far as white text on a black background is concerned, the inverted version of the image you sent me (150 DPI) cannot be recognized (attached). The Licence Agreement sample with an inverted paragraph could be recognized, though.

It seems that light-on-dark recognition, even if possible, is not reliable. For a tool that should work automatically on different kinds of text without human supervision, I think it is better to perform some image preprocessing (such as inversion or contrast enhancement) to enable recognition.

Best regards,
Gutek
File Attachment(s):
bialy43NewResInv.zip (5kb) downloaded 25 time(s).
 
#6 Posted : Sunday, March 2, 2008 6:29:00 AM(UTC)

Qasem Lubani  
Guest

Groups: Guests
Posts: 3,022

Was thanked: 2 time(s) in 2 post(s)

I'm afraid that the engine will not work with this type of image, and there is little that we can do about it. However, since the image does work and produces correct results once it is inverted back to dark text on a light background, you can use LEADTOOLS image processing functions to check whether an image is inverted and, if so, to invert it.
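
The check-then-invert idea can be sketched in a few lines of plain Python. This is only an illustration of the logic, not LEADTOOLS API: the function names are hypothetical, the image is assumed to be a list of rows of 8-bit grayscale values, and the polarity test assumes the background covers more area than the text. For color input such as yellow on blue, each pixel is first reduced to a luminance value with the usual Rec. 601 weights.

```python
def luminance(rgb: tuple[int, int, int]) -> int:
    """Convert an (R, G, B) pixel to 8-bit luminance (Rec. 601 weights)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def is_light_on_dark(pixels: list[list[int]], threshold: int = 128) -> bool:
    """Guess polarity: True when more than half of the pixels are dark.

    Assumes the background occupies most of the image area.
    """
    flat = [p for row in pixels for p in row]
    dark = sum(1 for p in flat if p < threshold)
    return dark > len(flat) / 2


def invert(pixels: list[list[int]]) -> list[list[int]]:
    """Return the photographic negative of an 8-bit grayscale image."""
    return [[255 - p for p in row] for row in pixels]


def normalize_polarity(pixels: list[list[int]]) -> list[list[int]]:
    """Return the image as dark text on a light background."""
    return invert(pixels) if is_light_on_dark(pixels) else pixels
```

A fixed threshold of 128 is a simplification; images with low contrast would likely need contrast enhancement or an adaptive threshold before this test becomes reliable.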
 
#7 Posted : Sunday, March 2, 2008 10:48:35 AM(UTC)

Gutek  
Gutek

Groups: Registered
Posts: 5


Yes, that's what I was going to do.

What about the "high dpi" issue? Are there any reasons for not allowing low dpi images to be loaded?

Gutek
 
#8 Posted : Monday, March 3, 2008 5:38:27 AM(UTC)

Qasem Lubani  
Guest

Groups: Guests
Posts: 3,022

Was thanked: 2 time(s) in 2 post(s)


Gutek,


The OCR engine internally uses the DPI information when it attempts to figure out which characters correspond to the shapes in the image. It has the limitation that it does not work properly with low-resolution images.
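
One way to see why the stored DPI can matter even though the pixels are identical: the DPI value is what turns a pixel height into a physical size. The sketch below is plain arithmetic based on the definition of a point (1/72 inch). It is an assumption about how an engine might use the value, not a description of the LEADTOOLS internals, and the 24-pixel glyph height is an illustrative figure.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def glyph_height_px(point_size: float, dpi: float) -> float:
    """Pixel height of text of the given point size at the given resolution."""
    return point_size * dpi / POINTS_PER_INCH


def apparent_point_size(height_px: float, dpi: float) -> float:
    """Point size implied by a glyph of height_px pixels at the stored DPI."""
    return height_px * POINTS_PER_INCH / dpi


# The same 24-pixel-tall glyph is interpreted differently depending on the
# DPI stored in the file, although no pixel has changed.
for dpi in (72, 150, 300):
    print(dpi, apparent_point_size(24, dpi))
```

Under this reading, changing only the metadata changes the physical text size the engine believes it is looking at, which may explain why the same pixels were accepted at 150 DPI and rejected at 72 DPI.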
 