An Overview of Recognition Modules

LEADTOOLS OCR Module - OmniPage Engine

Automatic recognition module

If the automatic recognition module is used, the engine will try to automatically select the most suitable recognition module for the zone. This is determined just before recognition, according to the zone's filling method and, if necessary, other settings, most typically the Character Set.

MTX (Mtext) omnifont recognition module

The MTX module recognizes machine printed text from printed publications, laser or ink-jet printers, and electric typewriters. Output from mechanical typewriters in good condition, and from draft-quality, letter-quality, or near-letter-quality dot-matrix printers is also acceptable.

MOR (Multi-lingual Omnifont Recognition) module

The MOR module recognizes machine printed text from printed publications, laser or ink-jet printers, and electric typewriters. Output from mechanical typewriters in good condition, and from letter- or near-letter quality (LQ, NLQ) dot-matrix printers is also acceptable.

DOT 9-pin draft dot-matrix recognition module

The Dot Matrix module is designed only for draft-quality 9-pin dot-matrix text.

For NLQ or LQ texts, the RECOGMODULE_OMNIFONT_PLUS2W, RECOGMODULE_OMNIFONT_PLUS3W, RECOGMODULE_MTEXT_OMNIFONT or RECOGMODULE_MULTI_LINGUAL_OMNIFONT modules are likely to give better results.

If the FILL_DRAFTDOT9 filling method is set together with RECOGMODULE_AUTO, RECOGMODULE_MTEXT_OMNIFONT is used, provided that it supports all characters (or languages or filters) validated for the zone. If any are not supported, this module is used.
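The fallback rule above can be sketched as a small decision function. This is only an illustration under stated assumptions: the enum and function names are hypothetical stand-ins rather than LEADTOOLS identifiers, and "this module" is read here as the Dot Matrix module.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; these are NOT the
   LEADTOOLS enumerations or functions. */
typedef enum { SKETCH_MODULE_MTEXT, SKETCH_MODULE_DOT } SketchModule;

/* Resolve the automatic module for a zone that uses the draft 9-pin
   dot-matrix filling method. mtx_supports[i] states whether the MTX
   module supports the i-th character set (or language or filter)
   validated for the zone. */
SketchModule sketch_resolve_draftdot9_auto(const bool *mtx_supports, int count)
{
    for (int i = 0; i < count; ++i) {
        if (!mtx_supports[i]) {
            return SKETCH_MODULE_DOT;   /* MTX cannot handle this item */
        }
    }
    return SKETCH_MODULE_MTEXT;         /* MTX supports everything validated */
}
```

Calling the function with a list in which every entry is true yields the MTX stand-in; a single false entry is enough to fall back to the Dot Matrix stand-in.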

OMR optical mark recognition module

For more information see the LEAD OMR Overview.

ICR-HNR hand-printed numeral recognition module

The ICR-HNR module is used to recognize hand-printed numerals and four additional signs. If more hand-printed characters are to be recognized, it is best to use the DOC2_RECOGMODULE_RER_PRINTED recognition module instead.

This recognition module can recognize the following hand-printed characters:

Use the DOC2_ZONE_CHAR_FILTER_DIGIT filter to exclude the last four characters. The DOC2_ZONE_CHAR_FILTER_PUNCTUATION and DOC2_ZONE_CHAR_FILTER_MISCELLANEOUS filters, and all other filters, have no effect.

✎ NOTE

Be sure to set the filter to the DOC2_ZONE_CHAR_FILTER_DIGIT enumerated value in DOC2_CHAR_FILTER whenever you are using the OmniPage engine to recognize ICR numeric characters. Other values are not recommended for this situation.

ICR-RER hand-printed recognition module

The ICR-RER module is a third-party recognition module from reRecognition GmbH, Germany. The engine includes version 4.2f of its recognition engine.

This recognition module can be used to recognize hand-printed alphanumerical characters, i.e. upper and lower case letters, the digits, and some others. Although it can be used to read flowing text, its main application area is in form-like situations, where the form designer has great control over the content, and possibly the length, of the hand-printed information given in each zone.

MAT matrix matching recognition module

The MAT module is designed to read certain groups of fixed-font characters specially designed for OCR or imaging applications, in which no two characters have similar shapes. Each character group has its own filling method. Application areas are in banking, check or waybill handling, product distribution and document validation, where high accuracy can be vital. It also handles some non-fixed print styles.

DOC2_FILL_OCRA

OCR-A. Uppercase English letters (26), digits, some punctuation and 3 special OCR-A symbols:

OCR Chair
OCR Hook
OCR Fork

DOC2_FILL_OCRB

OCR-B. Uppercase English letters (26), digits and some punctuation.

DOC2_FILL_MICR

MICR (E-13B). Digits plus some punctuation and 4 special MICR symbols:

OCR Branch Bank
OCR Amount of Check
OCR Dash
OCR Customer Account Number

DOC2_FILL_DOTDIGIT

Ten digits only and the period. Commas are also read, but converted to periods. Though this is in theory a fixed font, many variants of it are used.

DOC2_FILL_DASHDIGIT

Ten digits only and the period. Commas are also read, but converted to periods. Though this is in theory a fixed font, many variants of it are used.
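Because commas are reported as periods under the DOC2_FILL_DOTDIGIT and DOC2_FILL_DASHDIGIT filling methods, code that consumes the recognized text cannot tell a thousands separator from a decimal point. The following minimal sketch mimics the documented conversion on a plain C string to show what kind of output to expect. The function name is hypothetical and is not part of the LEADTOOLS API.

```c
#include <stddef.h>

/* Hypothetical helper, not a LEADTOOLS function. Mimics the documented
   behavior: digits and periods pass through, commas become periods.
   Returns the number of characters outside the documented set; those
   characters are left unchanged. */
size_t sketch_normalize_digit_field(char *text)
{
    size_t unexpected = 0;
    for (char *p = text; *p != '\0'; ++p) {
        if (*p == ',') {
            *p = '.';                       /* comma is read as a period */
        } else if (!((*p >= '0' && *p <= '9') || *p == '.')) {
            ++unexpected;                   /* not a digit or a period */
        }
    }
    return unexpected;
}
```

With this sketch, an input such as "1,234.50" ends up with two periods, so any numeric parsing of such fields needs its own rule for deciding which period is the decimal point.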

FRX (FireWorX) multi-lingual omnifont recognition module

The FRX module recognizes machine printed text from printed publications, laser or ink-jet printers, and electric typewriters. Output from mechanical typewriters in good condition, and from letter- or near-letter quality (LQ, NLQ) dot-matrix printers is also acceptable.

PLUS2W and PLUS3W omnifont recognition modules

The PLUS modules recognize machine printed text from printed publications, laser or ink-jet printers and electric typewriters. Output from mechanical typewriters in good condition may also be acceptable.

With either of these two voting modules, accuracy is considerably better, but recognition may take significantly more time than with any single module.
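This topic does not describe how the PLUS modules combine their results, so the following is only a generic sketch of the voting idea, not the OmniPage implementation. It assumes three hypothetical engines that each return a candidate string of the same length, and it takes a per-position majority. It also shows why voting costs time: every candidate must be produced before a vote can be taken.

```c
#include <stddef.h>

/* Generic illustration only; not the algorithm used by PLUS2W or PLUS3W.
   Per-position majority vote over three equal-length candidate strings.
   When all three disagree, the first candidate wins (an arbitrary
   tie-break). out must hold at least len + 1 bytes. */
void sketch_vote3(const char *a, const char *b, const char *c,
                  size_t len, char *out)
{
    for (size_t i = 0; i < len; ++i) {
        if (b[i] == c[i]) {
            out[i] = b[i];   /* b and c agree, so they form the majority */
        } else {
            out[i] = a[i];   /* a matches b or c, or nobody agrees */
        }
    }
    out[len] = '\0';
}
```

For example, voting over the candidates "c1ear", "clear", and "cleor" produces "clear", because each single-engine error is outvoted by the other two candidates.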

✎ NOTE

The following table shows the text recognition module support for each of the 119 languages (General for OmniPage engine):

Language MOR MTX FRX PLUS2W PLUS3W DOT RER
Afrikaans Yes No Yes Yes Yes Yes Yes
Albanian Yes No Yes Yes Yes Yes Yes
Aymara Yes No Yes Yes Yes Yes Yes
Basque Yes No Yes Yes Yes Yes Yes
Bemba Yes Yes No Yes Yes Yes Yes
Blackfoot Yes Yes No Yes Yes Yes Yes
Brazilian Yes Yes Yes Yes Yes Yes Yes
Breton Yes No Yes Yes Yes Yes Yes
Bugotu Yes Yes No Yes Yes Yes Yes
Bulgarian Yes No Yes Yes Yes No No
Byelorussian Yes No Yes Yes Yes No No
Catalan Yes No Yes Yes Yes Yes Yes
Chamorro Yes No No Yes Yes Yes Yes
Chechen Yes No No Yes Yes Yes Yes
Corsican Yes No No Yes Yes Yes Yes
Croatian Yes No Yes Yes Yes No Yes
Crow Yes Yes No Yes Yes Yes Yes
Czech Yes Yes No Yes Yes No Yes
Danish Yes Yes Yes Yes Yes Yes Yes
Dutch Yes Yes Yes Yes Yes Yes Yes
English Yes Yes Yes Yes Yes Yes Yes
Eskimo (Inuit) Yes No Yes Yes Yes No Yes
Esperanto Yes No No Yes Yes No No
Estonian Yes Yes No Yes Yes Yes Yes
Faroese Yes No Yes Yes Yes No No
Fijian Yes No No Yes Yes No Yes
Finnish Yes Yes Yes Yes Yes Yes Yes
French Yes Yes Yes Yes Yes Yes Yes
Frisian Yes No Yes Yes Yes Yes Yes
Friulian Yes No Yes Yes Yes Yes Yes
Gaelic (Irish) Yes No Yes Yes Yes Yes Yes
Gaelic (Scottish) Yes No Yes Yes Yes Yes Yes
Galician Yes Yes Yes Yes Yes Yes Yes
Ganda Yes No No Yes Yes No Yes
German Yes Yes Yes Yes Yes Yes Yes
Greek Yes No Yes Yes Yes Yes No
Guarani Yes No No Yes Yes Yes Yes
Hani Yes Yes No Yes Yes Yes Yes
Hawaiian Yes Yes Yes Yes Yes Yes Yes
Hungarian Yes No Yes Yes Yes Yes Yes
Icelandic Yes No Yes Yes Yes No No
Ido Yes Yes No Yes Yes Yes Yes
Indonesian Yes Yes Yes Yes Yes Yes Yes
Interlingua Yes Yes No Yes Yes Yes Yes
Italian Yes Yes Yes Yes Yes Yes Yes
Kabardian Yes No No Yes Yes No No
Kasub Yes No No Yes Yes No Yes
Kawa Yes Yes No Yes Yes Yes Yes
Kikuyu Yes No No Yes Yes No No
Kongo Yes Yes No Yes Yes Yes Yes
Kpelle Yes Yes No Yes Yes Yes Yes
Kurdish Yes No Yes Yes Yes No Yes
Latin Yes Yes Yes Yes Yes Yes Yes
Latvian Yes No Yes Yes Yes No Yes
Lithuanian Yes No Yes Yes Yes No Yes
Luba Yes No No Yes Yes No Yes
Luxembourgian Yes No No Yes Yes Yes Yes
Macedonian Yes No Yes Yes Yes No No
Malagasy Yes Yes No Yes Yes Yes Yes
Malay Yes No Yes Yes Yes No Yes
Malinke Yes No No Yes Yes Yes Yes
Maltese Yes No No Yes Yes No No
Maori Yes Yes No Yes Yes Yes Yes
Mayan Yes No No Yes Yes Yes Yes
Miao Yes Yes No Yes Yes Yes Yes
Minankabaw Yes No No Yes Yes No Yes
Mohawk Yes Yes No Yes Yes Yes Yes
Moldavian Yes No No Yes Yes No No
Nahuatl Yes Yes No Yes Yes Yes Yes
Norwegian Yes Yes Yes Yes Yes Yes Yes
Nyanja Yes Yes No Yes Yes Yes Yes
Occidental Yes No No Yes Yes Yes Yes
Ojibway Yes No No Yes Yes No Yes
Papiamento Yes No No Yes Yes Yes Yes
Pidgin English Yes Yes Yes Yes Yes Yes Yes
Polish Yes No Yes Yes Yes No Yes
Portuguese Yes Yes Yes Yes Yes Yes Yes
Provençal Yes No No Yes Yes Yes Yes
Quechua Yes No No Yes Yes Yes Yes
Rhaetic Yes No No Yes Yes Yes Yes
Romanian Yes No Yes Yes Yes No No
Romany Yes No No Yes Yes No Yes
Ruanda Yes Yes No Yes Yes Yes Yes
Rundi Yes Yes No Yes Yes Yes Yes
Russian Yes No Yes Yes Yes No No
Sami Yes No No Yes Yes No Yes
Sami, Lule Yes No No Yes Yes No Yes
Sami, Northern Yes No No Yes Yes No Yes
Sami, Southern Yes No No Yes Yes No Yes
Samoan Yes No No Yes Yes Yes Yes
Sardinian Yes No No Yes Yes Yes Yes
Serbian Yes No Yes Yes Yes No No
Serbian, Latinic Yes No Yes Yes Yes No Yes
Shona Yes Yes No Yes Yes Yes Yes
Sioux Yes Yes No Yes Yes Yes Yes
Slovak Yes No Yes Yes Yes No Yes
Slovenian Yes No Yes Yes Yes No Yes
Somali Yes Yes No Yes Yes Yes Yes
Sorbian (Wend) Yes No Yes Yes Yes No Yes
Sotho Yes No No Yes Yes Yes Yes
Spanish Yes Yes Yes Yes Yes Yes Yes
Sundanese Yes No No Yes Yes Yes Yes
Swahili Yes Yes Yes Yes Yes Yes Yes
Swazi Yes No No Yes Yes No Yes
Swedish Yes Yes Yes Yes Yes Yes Yes
Tagalog Yes Yes No Yes Yes Yes Yes
Tahitian Yes No Yes Yes Yes Yes Yes
Tinpo Yes Yes No Yes Yes Yes Yes
Tongan Yes Yes No Yes Yes Yes Yes
Tswana (Chuana) Yes No No Yes Yes Yes Yes
Tun Yes Yes No Yes Yes Yes Yes
Turkish Yes No Yes Yes Yes No Yes
Ukrainian Yes No Yes Yes Yes No No
Visayan Yes Yes No Yes Yes Yes Yes
Welsh Yes No Yes Yes Yes Yes Yes
Wolof Yes No No Yes Yes Yes Yes
Xhosa Yes Yes No Yes Yes Yes Yes
Zapotec Yes Yes No Yes Yes Yes Yes
Zulu Yes No Yes Yes Yes No Yes
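
An application that chooses a recognition module per language can encode the table above as data. The sketch below is a minimal illustration that copies only four rows (Czech, English, Greek, Russian) from the table. The flag names, the struct, and the lookup function are hypothetical and are not LEADTOOLS identifiers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flags, one per table column; not LEADTOOLS constants. */
enum {
    SK_MOR    = 1 << 0, SK_MTX    = 1 << 1, SK_FRX = 1 << 2,
    SK_PLUS2W = 1 << 3, SK_PLUS3W = 1 << 4, SK_DOT = 1 << 5,
    SK_RER    = 1 << 6
};

typedef struct {
    const char *language;
    unsigned    modules;   /* bitwise OR of the columns marked Yes */
} SketchSupportRow;

/* Excerpt of the support table: four rows copied from the table above. */
static const SketchSupportRow sketch_rows[] = {
    { "Czech",   SK_MOR | SK_MTX | SK_PLUS2W | SK_PLUS3W | SK_RER },
    { "English", SK_MOR | SK_MTX | SK_FRX | SK_PLUS2W | SK_PLUS3W
                 | SK_DOT | SK_RER },
    { "Greek",   SK_MOR | SK_FRX | SK_PLUS2W | SK_PLUS3W | SK_DOT },
    { "Russian", SK_MOR | SK_FRX | SK_PLUS2W | SK_PLUS3W },
};

/* Returns 1 if the module supports the language, 0 if it does not,
   and -1 if the language is not in this excerpt. */
int sketch_module_supports(const char *language, unsigned module)
{
    size_t count = sizeof(sketch_rows) / sizeof(sketch_rows[0]);
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(sketch_rows[i].language, language) == 0) {
            return (sketch_rows[i].modules & module) != 0;
        }
    }
    return -1;
}
```

For instance, a query for Russian with the MTX flag returns 0, while the same language with the FRX flag returns 1, matching the corresponding table row.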
Help Version 21.0.2021.1.7
© 1991-2021 LEAD Technologies, Inc. All Rights Reserved.

LEADTOOLS OCR Module - OmniPage Engine C API Help