Leadtools.Forms.DocumentReaders Namespace : DocumentObjectManager Class
public class DocumentObjectManager
'Declaration
Public Class DocumentObjectManager
'Usage
Dim instance As DocumentObjectManager
public ref class DocumentObjectManager
Instances of the DocumentObjectManager aren't created directly. Instead, access the instance that is automatically created inside a DocumentReader using the DocumentReader.ObjectManager property.
The DocumentObjectManager contains the ParsePageText method, which parses the text of a document page. Call ParsePageText in a loop to parse the text of every page in a document. You must call BeginParse before parsing starts and EndParse when parsing is finished.
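The call order described above (BeginParse, then ParsePageText once per page, then EndParse) can be sketched as follows. This is a minimal illustration only: `IPageTextParser`, `FakeParser`, and `ParseLoop` are hypothetical stand-ins written for this sketch, and their signatures (a page number in, a string out) are assumptions rather than the actual LEADTOOLS members. Only the three method names come from the text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in that models the three calls named in the text.
// It is NOT the LEADTOOLS DocumentObjectManager; the signatures are assumed.
public interface IPageTextParser
{
    void BeginParse(object ocrEngine);     // pass null when no OCR engine is needed
    string ParsePageText(int pageNumber);  // 1-based page number (assumption)
    void EndParse();
}

// Toy in-memory implementation so the sketch runs without LEADTOOLS.
public sealed class FakeParser : IPageTextParser
{
    private readonly string[] _pages;
    public bool IsParsing { get; private set; }

    public FakeParser(params string[] pages) { _pages = pages; }

    public void BeginParse(object ocrEngine) { IsParsing = true; }

    public string ParsePageText(int pageNumber)
    {
        // Reject calls made outside a BeginParse/EndParse pair.
        if (!IsParsing)
            throw new InvalidOperationException("Call BeginParse first.");
        return _pages[pageNumber - 1];
    }

    public void EndParse() { IsParsing = false; }
}

public static class ParseLoop
{
    // Parses pages 1..pageCount and always ends the parse session,
    // even if a page throws part-way through.
    public static List<string> ParseAllPages(IPageTextParser parser, int pageCount, object ocrEngine)
    {
        var texts = new List<string>();
        parser.BeginParse(ocrEngine);
        try
        {
            for (int page = 1; page <= pageCount; page++)
                texts.Add(parser.ParsePageText(page));
        }
        finally
        {
            parser.EndParse();
        }
        return texts;
    }
}
```

Wrapping the loop in try/finally is a design choice of this sketch, not a documented requirement: it keeps BeginParse and EndParse paired when a page fails to parse.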
The various document readers will parse text differently. Currently, LEADTOOLS ships with the following document readers:
DocumentReaderType.Pdf: This is the document reader responsible for parsing PDF documents. PDF document text is parsed without the need for an OCR engine.
DocumentReaderType.Xps: This is the document reader responsible for parsing XPS documents. XPS document text is parsed without the need for an OCR engine.
DocumentReaderType.Raster: This is the document reader responsible for parsing everything else. An OCR engine is required to parse the text of the document (pass a started object of type Leadtools.Forms.Ocr.IOcrEngine to BeginParse).
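The OCR requirement in the list above can be summarized in a small helper. This is a sketch under stated assumptions: `ReaderKind` is a hypothetical local enum that mirrors the three reader types named above, and `RequiresOcrEngine` is an illustrative helper, not a LEADTOOLS API.

```csharp
using System;

// Hypothetical local mirror of the three reader types listed above;
// it is not the LEADTOOLS DocumentReaderType enumeration itself.
public enum ReaderKind
{
    Pdf,
    Xps,
    Raster
}

public static class ReaderRules
{
    // Returns true when, per the list above, a started OCR engine
    // must be supplied before text can be parsed.
    public static bool RequiresOcrEngine(ReaderKind kind)
    {
        switch (kind)
        {
            case ReaderKind.Pdf:
            case ReaderKind.Xps:
                return false;  // text is parsed without an OCR engine
            case ReaderKind.Raster:
                return true;   // everything else needs OCR
            default:
                throw new ArgumentOutOfRangeException(nameof(kind));
        }
    }
}
```

A caller could use such a check to decide whether to start an OCR engine before calling BeginParse, or to pass nothing for PDF and XPS documents.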
LEADTOOLS will add more document readers and functionality in the near future for document types such as DICOM, DOC/DOCX (2007/2010), XLS/XLSX (2007/2010) and RTF.
More object types, such as images, bookmarks, hyperlinks and annotations, will also be added in the near future.
For an example, refer to DocumentReader.