OCR Languages and Spell Checking

The LEADTOOLS .NET OCR Toolkit supports languages and spell checking through the following separate parts:

The Language Environment

The language environment defines the character set(s) recognized by the OCR engine. For example, if you enable the English and German languages, the German characters (ä, Ä, é, ö, Ö, ü, Ü, ß) will be combined with the English characters to define the set recognized by the engine.
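The combination described above behaves like a set union. The following is a minimal conceptual sketch in Python, not the LEADTOOLS API: the `CHARSETS` table and the `recognized_characters` function are hypothetical, and the German entry contains only the characters listed in the paragraph above rather than the engine's real character table.

```python
import string

# Illustrative character tables only; the engine's real tables are larger.
CHARSETS = {
    "en": set(string.ascii_letters),
    # Assumption: just the German characters named in the text above.
    "de": set("äÄéöÖüÜß"),
}

def recognized_characters(enabled_languages):
    """Return the union of the character sets of all enabled languages."""
    combined = set()
    for code in enabled_languages:
        combined |= CHARSETS[code]
    return combined

english_only = recognized_characters(["en"])
english_and_german = recognized_characters(["en", "de"])
```

With only English enabled, a character such as ß falls outside the recognized set; enabling German as well adds it without removing any English character.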

To set the character sets to use in the engine, use the IOcrLanguageManager.EnableLanguages method. To get the character sets supported by the engine, use the IOcrLanguageManager.GetSupportedLanguages and IOcrLanguageManager.IsLanguageSupported methods. You can enable as many character sets as required.
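Because each engine supports a different list of character sets, it can help to check the requested languages before enabling them. The sketch below models that check in plain Python. It is not LEADTOOLS code: `split_supported` is a hypothetical helper, and the two sets simply copy the language codes from the Advantage and Plus tables later in this topic. In an application, the supported list would come from IOcrLanguageManager.GetSupportedLanguages instead.

```python
# Language codes copied from the character-set tables in this topic.
ADVANTAGE = {
    "en", "es", "fr", "de", "it", "bg", "ca", "cs", "da", "el", "fi", "hu",
    "id", "lt", "lv", "nl", "no", "pl", "pt", "ro", "ru", "sk", "sl", "sr",
    "sv", "tr", "uk", "vi", "ja", "ko", "zh-Hans", "zh-Hant",
}
PLUS = {
    "en", "af", "sq", "eu", "be", "bg", "ca", "hr", "cs", "da", "nl", "et",
    "fo", "fi", "fr", "gl", "de", "el", "hu", "is", "id", "it", "lv", "lt",
    "mk", "no", "pl", "pt", "pt-BR", "ro", "ru", "sr", "sr-Cyrl-CS", "sk",
    "sl", "es", "sv", "tr", "uk",
}

def split_supported(requested, supported):
    """Split requested codes into (usable, rejected), keeping their order."""
    usable = [code for code in requested if code in supported]
    rejected = [code for code in requested if code not in supported]
    return usable, rejected

requested = ["en", "vi", "is"]
advantage_result = split_supported(requested, ADVANTAGE)
plus_result = split_supported(requested, PLUS)
```

According to the tables, Vietnamese is listed only for the Advantage engine and Icelandic only for the Plus and Professional engines, so the same request yields a different usable list for each engine.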

The language environment does not perform spell checking on its own. To enable spell checking, use the spell checking subsystem.

The Spell Checking Subsystem

The spell checking subsystem is separate from the language environment and works as follows:

LEADTOOLS OCR supports spell checking and correction through the use of external dictionaries. The value of IOcrSpellCheckManager.SpellCheckEngine acts as a global switch to use a particular spell checker or turn spell checking off.

When you set the IOcrSpellCheckManager.SpellCheckEngine property to a value other than None, the OCR engine automatically tries to load the requested spell checker and queries the language dictionaries found on your machine. You can change Leadtools.Forms.Ocr.IOcrSpellCheckManager.SpellCheckEngine at any time during the life of the Leadtools.Forms.Ocr.IOcrEngine, depending on your application's needs. For example, you can disable spell checking while recognizing certain types of documents and then re-enable it for other types.

To query the languages for which a dictionary is available in an engine, use IOcrSpellCheckManager.GetSupportedSpellLanguages. Only one language dictionary can be used at a time inside the engine.
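Since the character sets and the dictionaries are separate lists, an enabled language may have no dictionary, and only one dictionary can be active. The following Python sketch models one possible selection rule (take the first enabled language that has a dictionary). It is an illustration under assumptions, not LEADTOOLS code: `pick_spell_language` is a hypothetical helper, and the set copies the Plus OCR engine dictionary codes from the table later in this topic. In an application, the list would come from IOcrSpellCheckManager.GetSupportedSpellLanguages.

```python
# Dictionary codes copied from the Plus OCR engine table in this topic.
PLUS_DICTIONARIES = {
    "en", "ca", "cs", "da", "nl", "fi", "fr", "de", "el", "hu", "it", "no",
    "pl", "pt", "ru", "es", "sv",
}

def pick_spell_language(enabled_languages, dictionaries):
    """Return the first enabled language that has a dictionary, else None."""
    for code in enabled_languages:
        if code in dictionaries:
            return code
    return None

# Bulgarian has a character set in the Plus engine but no dictionary listed.
chosen = pick_spell_language(["bg", "de", "fr"], PLUS_DICTIONARIES)
none_available = pick_spell_language(["bg", "hr"], PLUS_DICTIONARIES)
```

When no enabled language has a dictionary, the helper returns `None`; in that case turning spell checking off is a reasonable fallback.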

Language Character Sets Supported by Engine

For more information, refer to IOcrLanguageManager.

Advantage OCR Engine

English (en) Spanish (es) French (fr) German (de)
Italian (it) Bulgarian (bg) Catalan (ca) Czech (cs)
Danish (da) Greek (el) Finnish (fi) Hungarian (hu)
Indonesian (id) Lithuanian (lt) Latvian (lv) Dutch (nl)
Norwegian (no) Polish (pl) Portuguese (pt) Romanian (ro)
Russian (ru) Slovak (sk) Slovenian (sl) Serbian (sr)
Swedish (sv) Turkish (tr) Ukrainian (uk) Vietnamese (vi)
Japanese (ja) Korean (ko) Chinese Simplified (zh-Hans) Chinese Traditional (zh-Hant)

Plus OCR Engine

English (en) Afrikaans (af) Albanian (sq) Basque (eu)
Belarusian (be) Bulgarian (bg) Catalan (ca) Croatian (hr)
Czech (cs) Danish (da) Dutch (nl) Estonian (et)
Faroese (fo) Finnish (fi) French (fr) Galician (gl)
German (de) Greek (el) Hungarian (hu) Icelandic (is)
Indonesian (id) Italian (it) Latvian (lv) Lithuanian (lt)
Macedonian (mk) Norwegian (no) Polish (pl) Portuguese (pt)
Portuguese Brazil (pt-BR) Romanian (ro) Russian (ru) Serbian (sr)
Serbian Cyrillic (sr-Cyrl-CS) Slovak (sk) Slovenian (sl) Spanish (es)
Swedish (sv) Turkish (tr) Ukrainian (uk)

Professional OCR Engine

English (en) Afrikaans (af) Albanian (sq) Basque (eu)
Belarusian (be) Bulgarian (bg) Catalan (ca) Croatian (hr)
Czech (cs) Danish (da) Dutch (nl) Estonian (et)
Faroese (fo) Finnish (fi) French (fr) Galician (gl)
German (de) Greek (el) Hungarian (hu) Icelandic (is)
Indonesian (id) Italian (it) Latvian (lv) Lithuanian (lt)
Macedonian (mk) Norwegian (no) Polish (pl) Portuguese (pt)
Portuguese Brazil (pt-BR) Romanian (ro) Russian (ru) Serbian (sr)
Serbian Cyrillic (sr-Cyrl-CS) Slovak (sk) Slovenian (sl) Spanish (es)
Swedish (sv) Turkish (tr) Ukrainian (uk)

And the following Asian character sets (available with the Asian OCR Module):

Chinese Simplified (zh-Hans) Chinese Traditional (zh-Hant) Japanese (ja) Korean (ko)

Arabic OCR Engine

Arabic (ar)

Language Dictionaries Supported by Engine

For more information, refer to IOcrSpellCheckManager.

Advantage OCR Engine

English (en) Spanish (es) French (fr) German (de)
Italian (it) Bulgarian (bg) Catalan (ca) Czech (cs)
Danish (da) Greek (el) Finnish (fi) Hungarian (hu)
Indonesian (id) Lithuanian (lt) Latvian (lv) Dutch (nl)
Norwegian (no) Polish (pl) Portuguese (pt) Romanian (ro)
Russian (ru) Slovak (sk) Slovenian (sl) Serbian (sr)
Swedish (sv) Turkish (tr) Ukrainian (uk) Vietnamese (vi)
Japanese (ja) Korean (ko) Chinese Simplified (zh-Hans) Chinese Traditional (zh-Hant)

For more information, refer to Leadtools.Forms.Ocr.OcrSpellCheckEngine.

Plus OCR Engine

English (en) Catalan (ca) Czech (cs) Danish (da)
Dutch (nl) Finnish (fi) French (fr) German (de)
Greek (el) Hungarian (hu) Italian (it) Norwegian (no)
Polish (pl) Portuguese (pt) Russian (ru) Spanish (es)
Swedish (sv)

Professional OCR Engine

English (en) Catalan (ca) Czech (cs) Danish (da)
Dutch (nl) Finnish (fi) French (fr) German (de)
Greek (el) Hungarian (hu) Italian (it) Norwegian (no)
Polish (pl) Portuguese (pt) Portuguese Brazil (pt-BR) Russian (ru)
Slovenian (sl) Spanish (es) Swedish (sv) Turkish (tr)

Arabic OCR Engine

This feature is not yet supported for the Arabic OCR engine.


© 2006-2014 All Rights Reserved. LEAD Technologies, Inc.