Optical Character Recognition (OCR)

LEADTOOLS OCR features let you perform optical character recognition and turn images into documents. The OCR engine is based on the time-tested, industry-standard Nuance ScanSoft Capture Development System v12 engine. In addition to recognition, it accepts all 150+ LEAD-supported file formats as input. Recognized text can be exported to more than 40 different formats, including MS Word, MS Excel, dBASE and WordPerfect. The engine's processing speed makes it suitable for use in forms recognition and processing applications. Support for OCR features is included in the LEADTOOLS Document Imaging Suite, and can be added as a plug-in to the LEADTOOLS Document Imaging toolkit.

Features include preset confidence and accuracy levels that control how sensitive the engine is to unrecognized text, artificial intelligence that improves recognition on documents of the same type, and built-in and user-defined lexicons that limit the type of text recognized within a particular zone. LEADTOOLS also lets you verify or correct text during recognition. The OCR engine can perform automatic area segmentation, creating multi-layered zones and recognizing areas such as tables, rules, images and text. Alternatively, you can manually designate up to 250 such zones.
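The two ideas above, a capped list of manually designated zones and a confidence level that decides which text is flagged for verification, can be illustrated with a short sketch. This is not the LEADTOOLS API. Every name in it (Zone, ZoneList, words_needing_review) is hypothetical, and the 0.0 to 1.0 confidence scale is an assumption made for illustration only. The only value taken from the text is the limit of 250 manual zones.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Limit on manually designated zones, as stated in the text above.
MAX_MANUAL_ZONES = 250


@dataclass(frozen=True)
class Zone:
    """Hypothetical manual zone: a rectangle plus the kind of content expected in it."""
    left: int
    top: int
    right: int
    bottom: int
    kind: str  # for example "text", "table" or "image"


class ZoneList:
    """Hypothetical container that enforces the manual zone limit."""

    def __init__(self) -> None:
        self._zones: List[Zone] = []

    def add(self, zone: Zone) -> None:
        # Reject empty or inverted rectangles before counting them.
        if zone.right <= zone.left or zone.bottom <= zone.top:
            raise ValueError("zone rectangle must have positive width and height")
        if len(self._zones) >= MAX_MANUAL_ZONES:
            raise ValueError(f"at most {MAX_MANUAL_ZONES} manual zones are allowed")
        self._zones.append(zone)

    def __len__(self) -> int:
        return len(self._zones)


def words_needing_review(words: List[Tuple[str, float]],
                         min_confidence: float) -> List[str]:
    """Return the words whose confidence falls below the threshold.

    Confidence is assumed to be a float from 0.0 to 1.0 (an illustrative scale).
    """
    return [text for text, confidence in words if confidence < min_confidence]


if __name__ == "__main__":
    zones = ZoneList()
    zones.add(Zone(0, 0, 600, 120, "text"))

    recognized = [("Invoice", 0.98), ("T0tal", 0.41), ("2024", 0.87)]
    print(words_needing_review(recognized, min_confidence=0.80))
```

Raising the threshold sends more words to manual verification, and lowering it accepts more of the engine's output unreviewed. A real application would use the toolkit's own zone and confidence settings in place of these stand-ins.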

The engine also supports different fonts, sizes (5 to 72 point) and styles. Fax, dot-matrix and halftone images can be preprocessed to improve recognition results. The OCR engine supports English as well as major European and Scandinavian languages (Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, and Swedish). Support for dialects such as U.S. English, French Canadian, Latin American Spanish, Swiss German, and Brazilian Portuguese is also provided.

An Overview of Recognition Modules

OCR Key Features

OCR Function References and Examples

OCR Tutorials

Intelligent Character Recognition (ICR)

OMR

PDF OCR Plug-in