
DocumentTextImagesRecognitionMode Enumeration

Summary

Indicates how to treat the image elements encountered in the input SVG page during text extraction.

Syntax
C#
[SerializableAttribute()]
[DataContractAttribute()]
public enum DocumentTextImagesRecognitionMode

VB
<SerializableAttribute(),
 DataContractAttribute()>
Public Enum DocumentTextImagesRecognitionMode

C++
public:
   [SerializableAttribute,
   DataContractAttribute]
   enum class DocumentTextImagesRecognitionMode sealed

Members

Value  Member    Description
0      Auto      Extract text from the SVG text elements, and use OCR recognition only when the page contains nothing but raster elements.
1      Disabled  Do not use OCR recognition for the image elements. Instead, ignore the image elements.
2      Always    Use OCR recognition on the image elements. Add the recognition data to the final document page text along with the other SVG elements of the page. Requires a valid IOcrEngine instance.

Remarks

DocumentTextImagesRecognitionMode is the type of the DocumentText.ImagesRecognitionMode property, which determines how image elements are treated during text extraction from an SVG page. This value has no effect on raster pages, where OCR is always used.

The following table shows what occurs during DocumentPage.GetText for each value, depending on the type of the page:

Value     Page Type                                            Behavior
Auto      SVG with only text or mixed image and text elements  Only the text elements are extracted
Auto      SVG with raster elements only                        The image elements are recognized and text extracted using the OCR engine
Disabled  SVG with only text or mixed image and text elements  Only the text elements are extracted
Disabled  SVG with raster elements only                        No text is extracted
Always    SVG with only text or mixed image and text elements  The text elements are extracted and the image elements are recognized and text extracted using the OCR engine
Always    SVG with raster elements only                        The image elements are recognized and text extracted using the OCR engine

The engine will use DocumentPage.IsSvgSupported and DocumentPage.IsSvgConversionPreferred, as well as checking the SVG of the page elements (returned by DocumentPage.GetSvg) to perform the actions described above.
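
The behavior table above can be read as a small decision function. The following is a minimal sketch in standalone standard C++ that only models the table; it does not call LEADTOOLS. The names ImagesRecognitionMode, SvgPageContent, ExtractionPlan, and planFor are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Standalone model of the behavior table. These types mirror the names
// used in this topic but are NOT the LEADTOOLS API.
enum class ImagesRecognitionMode { Auto = 0, Disabled = 1, Always = 2 };

// Hypothetical classification of an SVG page by its content.
enum class SvgPageContent { TextOnlyOrMixed, RasterOnly };

// What text extraction does for a page.
struct ExtractionPlan {
    bool extractTextElements;  // text is taken from the SVG text elements
    bool ocrImageElements;     // image elements are recognized with OCR
};

// Maps (mode, page content) to the behavior listed in the table.
ExtractionPlan planFor(ImagesRecognitionMode mode, SvgPageContent content) {
    const bool hasText = (content == SvgPageContent::TextOnlyOrMixed);
    switch (mode) {
    case ImagesRecognitionMode::Auto:
        // OCR is used only when the page has nothing but raster elements.
        return {hasText, !hasText};
    case ImagesRecognitionMode::Disabled:
        // Image elements are ignored; a raster-only page yields no text.
        return {hasText, false};
    case ImagesRecognitionMode::Always:
        // Text elements are extracted and image elements are recognized.
        return {hasText, true};
    }
    assert(false && "unknown recognition mode");
    return {false, false};
}
```

For example, planFor(ImagesRecognitionMode::Disabled, SvgPageContent::RasterOnly) yields a plan with both flags false, which corresponds to the "No text is extracted" row.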

When Always is used, a valid (started) IOcrEngine instance must be set in DocumentText.OcrEngine.

When Auto is used, a valid (started) IOcrEngine instance should be set in DocumentText.OcrEngine. If this value is null, then the framework will behave as if Disabled was used.
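
The two OCR engine rules above can be summarized as a function that resolves the requested mode against engine availability. This is a hedged sketch in standalone standard C++, not LEADTOOLS code. The names ResolvedMode and effectiveMode are hypothetical. This topic does not state how the framework reports a missing engine when Always is requested, so the sketch simply marks that combination as invalid; that outcome is an assumption.

```cpp
#include <cassert>

// Standalone model; NOT the LEADTOOLS API.
enum class ImagesRecognitionMode { Auto = 0, Disabled = 1, Always = 2 };

// Result of resolving a requested mode against OCR engine availability.
struct ResolvedMode {
    bool valid;                  // false when the configuration is unusable
    ImagesRecognitionMode mode;  // the mode that effectively applies
};

// ocrEngineStarted models "a valid (started) IOcrEngine is set in
// DocumentText.OcrEngine".
ResolvedMode effectiveMode(ImagesRecognitionMode requested, bool ocrEngineStarted) {
    switch (requested) {
    case ImagesRecognitionMode::Disabled:
        // OCR is never used, so the engine does not matter.
        return {true, ImagesRecognitionMode::Disabled};
    case ImagesRecognitionMode::Auto:
        // Without an engine, Auto behaves as if Disabled were used.
        return {true, ocrEngineStarted ? ImagesRecognitionMode::Auto
                                       : ImagesRecognitionMode::Disabled};
    case ImagesRecognitionMode::Always:
        // An engine is required. Flagging the missing-engine case as
        // invalid is an assumption of this sketch.
        return {ocrEngineStarted, ImagesRecognitionMode::Always};
    }
    assert(false && "unknown recognition mode");
    return {false, requested};
}
```

In practice this means checking that the engine is started before selecting Always, and treating Auto without an engine as text-elements-only extraction.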

Note: When using the OcrEngineType.LEAD engine, DocumentPage.GetText tries to optimize the speed of OCR recognition for text format output (for instance, it does not try to recognize font decorations such as bold or italic). It does so when Recognition.AutoRecognizeManager.FormatSpeedOptimized is true (the default value). This optimization may cause DocumentPage.GetText to produce slightly different recognition results on complex input raster images than IOcrPage.GetText, which does not use the value of this setting. Therefore, if exactly the same results from the two methods are desired, set the value of this setting to false in the IOcrEngine used with the document. Refer to LEADTOOLS OCR Module - LEAD Engine Settings for more information.

Requirements

Target Platforms

See Also

Reference

Leadtools.Document Namespace

Help Version 20.0.2019.9.19
© 1991-2019 LEAD Technologies, Inc. All Rights Reserved.

Leadtools.Document Assembly