Leadtools.Forms.Ocr Requires Document/Medical product license | Help Version 16.5.9.25
OcrDocumentFormat Enumeration



(Deprecated) The document formats supported by the LEADTOOLS OCR toolkit.

Syntax

Visual Basic (Declaration) 
<ObsoleteAttribute("Use Leadtools.Forms.DocumentWriters.DocumentFormat instead")>
<SerializableAttribute()>
Public Enum OcrDocumentFormat 
   Inherits Enum
Visual Basic (Usage)
Dim instance As OcrDocumentFormat
C# 
[ObsoleteAttribute("Use Leadtools.Forms.DocumentWriters.DocumentFormat instead")]
[SerializableAttribute()]
public enum OcrDocumentFormat : Enum 
C++/CLI 
[ObsoleteAttribute("Use Leadtools.Forms.DocumentWriters.DocumentFormat instead")]
[SerializableAttribute()]
public enum class OcrDocumentFormat : public Enum 

Members

Member | Description
AsciiText | ASCII text. This is the most basic format: the document will be a text file with a line break after each line. If a table is present, its cells are positioned by tabs. The text returned by RecognizeText uses this format.

Note: Use DocumentFormat.Text instead.

AsciiTextLayoutRetained | ASCII text output, layout retention with mimicked spaces. Line/cell contents are surrounded by quotes ("").

Note: Use DocumentFormat.Text instead.

AsciiTextCommaDelimited | ASCII comma-delimited text output. Line/cell contents are surrounded by quotes ("").

Note: Use DocumentFormat.Text instead.

AsciiTextFormatted | ASCII text output allowing quick text conversion. Line break after each line and after each zone.

Note: Use DocumentFormat.Text instead.

UnicodeText | UNICODE text with a line break after each line. If a table is present, its cells are positioned by tabs.

Note: Use DocumentFormat.Text instead.

UnicodeTextLayoutRetained | UNICODE text output, layout retention with mimicked spaces. Line/cell contents are surrounded by quotes ("").

Note: Use DocumentFormat.Text instead.

UnicodeTextCommaDelimited | UNICODE comma-delimited text output. Line/cell contents are surrounded by quotes ("").

Note: Use DocumentFormat.Text instead.

UnicodeTextFormatted | UNICODE text output allowing quick text conversion. Line break after each line and after each zone.

Note: Use DocumentFormat.Text instead.

Html32 | HTML output. HTML 3.2 is useful for exporting with partial formatting. The output files are supported by all major browsers.

Note: Use DocumentFormat.Html instead.

Html40 | HTML output. HTML 4.0 can set the exact position and size of objects. Use this output format for full formatting.

Note: Use DocumentFormat.Html instead.

Word97 | Microsoft Word 97 (doc) output format.

Note: Use DocumentFormat.Doc instead.

Word2000 | Microsoft Word 2000 (doc) output format.

Note: Use DocumentFormat.Doc instead.

Word2003 | Microsoft Word 2003 (doc) output format.

Note: Use DocumentFormat.Doc instead.

WordML | Microsoft Office Open XML (docx) output format.

Note: The LEADTOOLS Document Writers does not currently support an equivalent to this format.

Excel97 | Microsoft Excel 97 (xls) output format.

Note: The LEADTOOLS Document Writers does not currently support an equivalent to this format.

Excel2000 | Microsoft Excel 2000 (xls) output format.

Note: The LEADTOOLS Document Writers does not currently support an equivalent to this format.

Rtf | Rich Text Format for Word 97 and later.

Note: Use DocumentFormat.Rtf instead.

RtfWordPad | Rich Text Format for Microsoft WordPad.

Note: Use DocumentFormat.Rtf instead.

InfoPath | Microsoft InfoPath XML document output format.

Note: The LEADTOOLS Document Writers does not currently support an equivalent to this format.

Pdf | Adobe PDF. Displayed in a PDF reader, the generated file looks very similar to the original document. The text can be searched. The PDF file contains the recognized characters in the same positions as in the original. The original page image is overlaid on top of the PDF document.

Note: Use DocumentFormat.Pdf instead.

PdfImage | Adobe PDF with raster image only.

Note: Use DocumentFormat.Pdf instead.

PdfText | Adobe PDF with text only. The text can be searched. The PDF file contains the recognized characters in the same positions as in the original. The original page image is not overlaid on top of the PDF document.

Note: Use DocumentFormat.Pdf instead.

PdfEdited | Adobe PDF with text and image. Use this format if you have used IOcrPage.SetRecognizedCharacters to insert or delete characters in the recognized data. The engine will rearrange the character boxes before saving the resulting PDF file.

Note: Use DocumentFormat.Pdf instead.

PdfWithImageSubstitutes | Adobe PDF with text only. Missing and rejected characters are replaced by small images from the original page, resulting in a better-looking document than PdfText. The text can be searched. The PDF file contains the recognized characters in the same positions as in the original.

Note: Use DocumentFormat.Pdf instead.

PdfA | Adobe PDF/A format. The original page image is overlaid on top of the PDF document. This format is optimized for the long-term archiving of electronic documents and is based on the PDF Reference Version 1.4 from Adobe Systems Inc. (implemented in Adobe Acrobat 5).

Note: Use DocumentFormat.Pdf instead.

PdfAText | Adobe PDF/A format with text only. This format is optimized for the long-term archiving of electronic documents and is based on the PDF Reference Version 1.4 from Adobe Systems Inc. (implemented in Adobe Acrobat 5).

Note: Use DocumentFormat.Pdf instead.
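
The replacement notes in the member list above can be collected into a single lookup when migrating existing code. The following is a minimal sketch only: OcrFormatMigration is a hypothetical helper, not part of LEADTOOLS, and it works on member names as strings so that it compiles without the LEADTOOLS assemblies. Members for which the notes list no Document Writers equivalent (WordML, Excel97, Excel2000 and InfoPath) are deliberately left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a LEADTOOLS type): maps each deprecated
// OcrDocumentFormat member name to the DocumentFormat member name
// given in the "Note" for that member.
public static class OcrFormatMigration
{
    private static readonly Dictionary<string, string> Replacements =
        new Dictionary<string, string>(StringComparer.Ordinal);

    static OcrFormatMigration()
    {
        Add("Text", "AsciiText", "AsciiTextLayoutRetained", "AsciiTextCommaDelimited",
            "AsciiTextFormatted", "UnicodeText", "UnicodeTextLayoutRetained",
            "UnicodeTextCommaDelimited", "UnicodeTextFormatted");
        Add("Html", "Html32", "Html40");
        Add("Doc", "Word97", "Word2000", "Word2003");
        Add("Rtf", "Rtf", "RtfWordPad");
        Add("Pdf", "Pdf", "PdfImage", "PdfText", "PdfEdited",
            "PdfWithImageSubstitutes", "PdfA", "PdfAText");
    }

    // Registers one replacement name for several deprecated names.
    private static void Add(string replacement, params string[] deprecatedNames)
    {
        foreach (string name in deprecatedNames)
        {
            Replacements[name] = replacement;
        }
    }

    // Returns true and sets replacement when the notes name an equivalent;
    // returns false for members with no listed equivalent or unknown names.
    public static bool TryGetReplacement(string deprecatedName, out string replacement)
    {
        return Replacements.TryGetValue(deprecatedName, out replacement);
    }
}
```

Note that several deprecated members collapse onto one replacement (for example, all seven PDF variants map to Pdf), so the distinction between them has to be expressed elsewhere in the new code; this topic does not say how.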

Example

For an example, refer to IOcrDocument, IOcrDocumentManager and IOcrEngine.

Remarks

(Deprecated) All formats supported by Leadtools.Forms.DocumentWriters can now be used from OCR. For a list of the formats supported by LEADTOOLS OCR, refer to DocumentFormat. To get the engine's native formats (if any), use GetEngineSupportedFormats.

The IOcrDocument interface contains the IOcrDocument.Save methods, which allow you to save the recognized page data to a final document format such as PDF, DOC or HTML (or XML through IOcrDocument.SaveXml).

Not all of the formats are supported by an IOcrEngine. To get the formats supported by a particular engine, use the IOcrDocumentManager.GetSupportedFormats or IOcrDocumentManager.IsFormatSupported methods.

To get the file extension for an OcrDocumentFormat, use IOcrDocumentManager.GetFormatFileExtension.

To get the friendly name of an OcrDocumentFormat, use IOcrDocumentManager.GetFormatFriendlyName.
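
Because not every engine supports every format, calling code often keeps an ordered list of preferred formats and takes the first one the engine accepts. The sketch below shows that selection step in isolation. FormatChooser is a hypothetical helper, not a LEADTOOLS type, and the support check is passed in as a delegate so the example compiles without the LEADTOOLS assemblies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a LEADTOOLS type): picks the first preferred
// value that the supplied support check accepts.
public static class FormatChooser
{
    public static bool TryFirstSupported<T>(
        IEnumerable<T> preferred, Predicate<T> isSupported, out T chosen)
    {
        if (preferred == null) throw new ArgumentNullException("preferred");
        if (isSupported == null) throw new ArgumentNullException("isSupported");

        foreach (T candidate in preferred)
        {
            if (isSupported(candidate))
            {
                chosen = candidate;
                return true;
            }
        }

        // Nothing in the preference list passed the check.
        chosen = default(T);
        return false;
    }
}
```

In an application, T would be OcrDocumentFormat and the predicate would wrap IOcrDocumentManager.IsFormatSupported, assuming that method takes a single format value and returns a Boolean; check the method's own topic for its exact signature. The chosen value could then be passed to GetFormatFileExtension to build the output file name.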

Some of the document formats require a special key to unlock. Before using these formats, you must first unlock the required support using the RasterSupport class.

The following table lists the document formats and the support type required to be unlocked before using them:
Document Format | Support Type
Pdf, PdfImage, PdfText, PdfEdited and PdfWithImageSubstitutes | RasterSupportType.OcrPlusPdfOutput when using the OcrEngineType.Plus engine, RasterSupportType.OcrProfessionalPdfOutput when using the OcrEngineType.Professional engine and RasterSupportType.OcrAdvantagePdfLeadOutput when using the OcrEngineType.Advantage engine
PdfA and PdfAText | RasterSupportType.OcrPlusPdfLeadOutput when using the OcrEngineType.Plus engine, RasterSupportType.OcrProfessionalPdfLeadOutput when using the OcrEngineType.Professional engine and RasterSupportType.OcrAdvantagePdfLeadOutput when using the OcrEngineType.Advantage engine
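
The table above can be expressed as a small lookup, which may help when deciding which support type to unlock for a given engine and output format. This is a sketch under stated assumptions: OcrUnlockLookup is a hypothetical helper, not a LEADTOOLS type, and it takes and returns member names as strings so that it compiles without the LEADTOOLS assemblies. It covers only what the table lists; any other support the engine itself may need is outside its scope.

```csharp
using System;

// Hypothetical helper (not a LEADTOOLS type): returns the name of the
// RasterSupportType member that the table lists for an engine/format pair.
public static class OcrUnlockLookup
{
    // engine: an OcrEngineType member name ("Plus", "Professional", "Advantage").
    // format: an OcrDocumentFormat member name.
    // Returns null when the table lists no support type for the format.
    public static string RequiredSupport(string engine, string format)
    {
        bool isPdfA = format == "PdfA" || format == "PdfAText";
        bool isPdf = format == "Pdf" || format == "PdfImage" || format == "PdfText"
            || format == "PdfEdited" || format == "PdfWithImageSubstitutes";

        if (!isPdf && !isPdfA)
        {
            return null;
        }

        if (engine != "Plus" && engine != "Professional" && engine != "Advantage")
        {
            throw new ArgumentException("Unknown engine: " + engine, "engine");
        }

        // PDF/A always uses the "PdfLeadOutput" type; for plain PDF the table
        // also lists "PdfLeadOutput" when the engine is Advantage.
        if (isPdfA || engine == "Advantage")
        {
            return "Ocr" + engine + "PdfLeadOutput";
        }

        return "Ocr" + engine + "PdfOutput";
    }
}
```

The returned name identifies the RasterSupportType member to unlock through the RasterSupport class before saving in that format.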

Inheritance Hierarchy

System.Object
   System.ValueType
      System.Enum
         Leadtools.Forms.Ocr.OcrDocumentFormat

Requirements

Target Platforms: Microsoft .NET Framework 3.0, Windows XP, Windows Server 2003 family, Windows Server 2008 family

See Also

OcrDocumentFormat requires an OCR module license and unlock key. For more information, refer to: Imaging Pro/Document/Medical Features