#1 Posted : Friday, February 15, 2008 6:00:02 AM(UTC)

mattie

Groups: Registered
Posts: 37


Hi,

We generate some documents ourselves and print a code on them so that we can classify each document afterwards (we don't use barcodes, as they seem to take more space and carry less information in a restricted area of the page). Could you suggest which font face and size would give optimal recognition when scanning at 200 dpi?

thanks
matthias
 


#2 Posted : Sunday, February 17, 2008 12:00:17 AM(UTC)
Maen Hasan

Groups: Registered, Tech Support
Posts: 1,326

Was thanked: 1 time(s) in 1 post(s)

Hello,

To get good OCR results, the text must be clear and readable. Also, the size of the image should be A3 or smaller.

Thanks,
Maen Badwan
LEADTOOLS Technical Support
 
#3 Posted : Monday, February 18, 2008 12:00:04 AM(UTC)

mattie

Groups: Registered
Posts: 37


You mean there is absolutely no difference between Arial 20pt and Times New Roman 8pt? Kinda hard to believe ;) There must be an optimal font size, or at least a minimum size above which recognition no longer improves. Also, I guess there must be a difference in recognition between serif and sans-serif fonts?
 
#4 Posted : Monday, February 18, 2008 3:23:51 AM(UTC)
Maen Hasan

Groups: Registered, Tech Support
Posts: 1,326

Was thanked: 1 time(s) in 1 post(s)

Hello,

The toolkit and OCR engine documentation do not publish a minimum font size or a specific font type for optimal recognition. In general, the larger the font and the higher the resolution, the better the OCR results you will get.

I suggest you do your own testing using the restricted area in your documents to see what type of codes to use.
I also recommend using a lossless compression format, such as CCITT G4 or JBIG if your images are black and white.
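
To put numbers on the relation between font size and scan resolution, here is a minimal sketch of the arithmetic. It relies only on the definition of a typographic point (1/72 inch). The function names and the `x_height_ratio` default of 0.5 are assumptions made for illustration; the real ratio depends on the typeface, and nothing here comes from the LEADTOOLS documentation.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def em_height_px(point_size, dpi):
    """Height of the font's em square in pixels at the given scan resolution."""
    return point_size / POINTS_PER_INCH * dpi


def x_height_px(point_size, dpi, x_height_ratio=0.5):
    """Approximate lowercase letter height in pixels.

    x_height_ratio is a rough assumption; it varies from font to font.
    """
    return em_height_px(point_size, dpi) * x_height_ratio


for size in (8, 10, 20):
    print(f"{size} pt at 200 dpi: em {em_height_px(size, 200):.1f} px, "
          f"lowercase about {x_height_px(size, 200):.1f} px")
```

At 200 dpi this gives an em height of roughly 22 px for 8 pt and roughly 56 px for 20 pt, so the two sizes differ by a factor of 2.5 in every dimension the engine sees. Whether the smaller one is still enough has to come from testing.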

Thanks,
Maen Badwan
LEADTOOLS Technical Support
 
#5 Posted : Monday, February 18, 2008 3:40:36 AM(UTC)

mattie

Groups: Registered
Posts: 37


We are scanning 200 dpi A4 sheets in color and saving to JPEG (no negotiation there). Since we are still in the design stage of the forms, we can choose the position and size of the form-identifying string we want to recognize. Of course, the smaller this area the better, because that leaves more space for form input. I am a bit hesitant to rely on the results of the rather limited test set I can generate myself, since I cannot tell whether a correct recognition in my specific case was just luck. But thanks anyway :)
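
The "was I just lucky?" worry can be quantified. The sketch below is a standard binomial bound, not anything specific to LEADTOOLS: if every scan fails independently with the same probability, then a run of error-free trials puts an upper limit on that probability at a chosen confidence level. The function names are made up for this example, and the independence assumption is a strong one, since real scans share the same scanner, paper and print quality.

```python
import math


def error_rate_upper_bound(error_free_trials, confidence=0.95):
    """Largest per-scan error rate still consistent with seeing zero errors.

    Solves (1 - p) ** n = 1 - confidence for p, assuming independent trials.
    """
    return 1 - (1 - confidence) ** (1 / error_free_trials)


def trials_needed(max_error_rate, confidence=0.95):
    """Error-free trials required to claim the error rate is below max_error_rate."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_error_rate))


for n in (20, 100, 1000):
    print(f"{n} clean scans: error rate below "
          f"{error_rate_upper_bound(n):.2%} (95% confidence)")
print("clean scans needed for a rate under 1%:", trials_needed(0.01))
```

For small rates the bound is close to 3 divided by the number of clean scans, so a test set of a few dozen pages can only rule out error rates above roughly ten percent.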

 
#6 Posted : Monday, February 18, 2008 4:57:34 AM(UTC)
Maen Hasan

Groups: Registered, Tech Support
Posts: 1,326

Was thanked: 1 time(s) in 1 post(s)

Hello,

Using JPEG is a problem in itself.
There are 2 groups of JPEG formats, one of them is lossy and the other is lossless.

The problem with lossy JPEG is that it changes the pixel data every time you load and save. This could lead to more recognition errors when using OCR.
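
As a rough illustration of where the loss comes from, the following toy example applies a transform-and-quantize step to a single row of eight pixel values. It is a deliberately simplified sketch: real baseline JPEG uses a two-dimensional 8x8 DCT, a level shift and a per-frequency quantization table, whereas this uses a one-dimensional DCT and one uniform step size. The pixel values and the step of 40 are made up for the example.

```python
import math

N = 8  # JPEG works on 8x8 blocks; this toy uses a single row of 8 samples


def _scale(k):
    # Normalisation factors that make the transform orthonormal.
    return math.sqrt((1 if k == 0 else 2) / N)


def dct(row):
    """Forward DCT-II of one row of samples."""
    return [_scale(k) * sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n, x in enumerate(row))
            for k in range(N)]


def idct(coeffs):
    """Inverse of dct()."""
    return [sum(_scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k, c in enumerate(coeffs))
            for n in range(N)]


def lossy_round_trip(row, step):
    """Quantize the coefficients to multiples of step, then reconstruct."""
    quantized = [round(c / step) for c in dct(row)]  # information is lost here
    return [round(v) for v in idct([q * step for q in quantized])]


# A dark stroke on a light background, similar to a cut through a printed glyph.
edge = [200, 200, 200, 50, 50, 200, 200, 200]
restored = lossy_round_trip(edge, step=40)
print("original:", edge)
print("restored:", restored)
```

The restored row no longer matches the original, and the change is concentrated around the sharp edge, which is exactly where character outlines live.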

The problem with lossless JPEG is that very few non-LEAD applications understand it, so unless your images will only be viewed by your own application (or another application built with LEADTOOLS), you should generally avoid it.

Is it possible to save the data you need in the JPEG comments instead of writing it into the image pixel data?
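
For reference, a JPEG comment is an ordinary header segment (marker 0xFFFE followed by a two-byte big-endian length and the text), so it can be written and read without touching the pixel data. The sketch below does this on raw bytes with the Python standard library only. The helper names and the `FORM-0042` identifier are hypothetical, the parser handles only the simple case of length-prefixed segments before the scan data, and it assumes the identifier is already known when the file is written. It does not use the LEADTOOLS API.

```python
import struct

SOI = b"\xff\xd8"  # start-of-image marker
COM = 0xFE         # comment segment (0xFFFE)
SOS = 0xDA         # start of scan; compressed pixel data follows


def _segments(data):
    """Yield (marker, start, end) for each header segment before the scan."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == SOS:
            return
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, pos, pos + 2 + length  # length counts its own two bytes
        pos += 2 + length


def add_comment(data, text):
    """Return a copy of data with a comment segment after the leading APPn segments."""
    payload = text.encode("ascii")
    if len(payload) > 65533:
        raise ValueError("comment too long for one segment")
    segment = b"\xff\xfe" + struct.pack(">H", len(payload) + 2) + payload
    insert_at = 2
    for marker, start, end in _segments(data):
        if 0xE0 <= marker <= 0xEF and start == insert_at:
            insert_at = end  # keep JFIF/Exif headers directly after SOI
        else:
            break
    return data[:insert_at] + segment + data[insert_at:]


def read_comments(data):
    """Return the text of every comment segment found before the scan data."""
    return [data[start + 4:end].decode("ascii", "replace")
            for marker, start, end in _segments(data) if marker == COM]


# Structural stub only (SOI, a JFIF-style APP0, an empty SOS, EOI); not a viewable image.
stub = (SOI + b"\xff\xe0" + struct.pack(">H", 16) + b"JFIF\x00" + bytes(9)
        + b"\xff\xda\x00\x02" + b"\xff\xd9")
tagged = add_comment(stub, "FORM-0042")
print(read_comments(tagged))
```

With a real file, the same two functions would be applied to the bytes read from disk. Whether a given scanner driver or imaging tool preserves comment segments on re-save is something to verify separately.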

Thanks,
Maen Badwan
LEADTOOLS Technical Support
 
#7 Posted : Sunday, June 20, 2010 9:23:25 PM(UTC)

Bhavik

Groups: Registered
Posts: 2


Hello,

Did you settle on a standard font size and face? I am working on OCR for one of my utilities and would like to know whether LEADTOOLS OCR performs well with a specific font.
 
#8 Posted : Monday, June 21, 2010 2:01:20 AM(UTC)

mattie

Groups: Registered
Posts: 37


Hi Bhavik,

We ended up using the OCR-A font [1], which claims to be designed specifically for OCR processing.
However, it turned out that this only works well if the zoning engine creates a separate zone for this text field, which does not always happen, e.g. when the text is aligned with other text (in another font).

So, the best recommendation I can give is: use the same font as any other text on the form. There are certain limitations on the font size, but I don't remember them off the top of my head.
As long as you stay within 'normal' size ranges (> 10) and standard font faces, you should be safe.

HTH
matthias

[1] http://en.wikipedia.org/wiki/OCR-A_font
 