LEADTOOLS OCR (Leadtools.Forms.Ocr assembly)
LEAD Technologies, Inc

IOcrAutoRecognizeManager Interface

Provides support for the one-shot "fire and forget" approach to OCR, suitable for unattended recognition.
Syntax
C#
public interface IOcrAutoRecognizeManager
Visual Basic
'Declaration
Public Interface IOcrAutoRecognizeManager
'Usage
Dim instance As IOcrAutoRecognizeManager
JavaScript
function Leadtools.Forms.Ocr.IOcrAutoRecognizeManager()
C++/CLI
public interface class IOcrAutoRecognizeManager
Remarks

You can access the instance of the IOcrAutoRecognizeManager used by an IOcrEngine through the IOcrEngine.AutoRecognizeManager property.

The members of this interface will let you create a document from an image file on disk with optional progress and status monitors.

You can use the Run methods to convert an image on disk to a final document in one line of code, using any of the document formats supported by this IOcrEngine.

You can also create jobs using the CreateJob method and then run them synchronously through RunJob or asynchronously through RunJobAsync.

The IOcrAutoRecognizeManager interface also has the following options to use with the Run, RunJob and RunJobAsync methods:

Member Description
MaximumPagesBeforeLtd

Adds support for converting a document with an unlimited number of pages. An OCR operation on a document that contains a large number of pages (10 or more) might result in an out-of-memory error.

All of the LEADTOOLS OCR engines support saving the intermediate recognition results to a temporary LTD file (DocumentFormat.Ltd). The results of subsequent pages are appended to this temporary file. When all the pages of the document have been recognized, the engine converts the temporary LTD file to the desired output format.

The MaximumPagesBeforeLtd property defines the maximum number of pages processed as a whole. For example, if the original document has 20 pages and the value of this property is 8, the engine will recognize the first 8 pages and save the results to a temporary file, recognize the second 8 pages and append the results, and finally recognize the last 4 pages and convert the temporary document into the final format.

PreprocessPageCommands

Holds a collection of OcrAutoPreprocessPageCommand items that control which auto-preprocess operations to perform on each document page prior to recognition.

MaximumThreadsPerJob

Maximum number of threads to use per job. You can instruct IOcrAutoRecognizeManager to use all available machine CPUs/cores when recognizing a document. This can greatly reduce the time required to finish the OCR operation.

JobErrorMode

Ability to resume on non-critical errors. For example, if a source document has a page that could not be recognized, the offending page is added to the final document as a graphics image and recognition continues with the next page.

JobStarted, JobProgress, JobOperation and JobCompleted events

Events to track when synchronous and asynchronous jobs have started, are running and have completed.

AbortAllJobs

Aborts all running and pending jobs.

EnableTrace

Outputs debug messages to the standard .NET trace listeners.

Some OCR engine types support multi-threaded document creation by using one IOcrEngine and multiple IOcrDocument or IOcrAutoRecognizeJob objects, each in its own dedicated thread. For more information, refer to Multi-Threading with LEADTOOLS OCR.
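
The page batching described for MaximumPagesBeforeLtd can be illustrated with a small, self-contained sketch. It does not call LEADTOOLS at all: LtdBatchSketch and SplitIntoBatches are hypothetical names introduced here only to model the arithmetic of the 20-page, 8-pages-per-batch example, under the assumption that the property holds a positive value. How the engine actually schedules pages internally may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, NOT part of the LEADTOOLS API. It only models the
// page-batching arithmetic described for MaximumPagesBeforeLtd.
public static class LtdBatchSketch
{
   // Returns (firstPage, lastPage) pairs, 1-based and inclusive.
   // Assumption: maximumPagesBeforeLtd is a positive number.
   public static List<Tuple<int, int>> SplitIntoBatches(int pageCount, int maximumPagesBeforeLtd)
   {
      if (pageCount < 0)
         throw new ArgumentOutOfRangeException("pageCount");
      if (maximumPagesBeforeLtd < 1)
         throw new ArgumentOutOfRangeException("maximumPagesBeforeLtd");

      var batches = new List<Tuple<int, int>>();
      for (int first = 1; first <= pageCount; first += maximumPagesBeforeLtd)
      {
         // The last batch may be shorter than the maximum
         int last = Math.Min(first + maximumPagesBeforeLtd - 1, pageCount);
         batches.Add(Tuple.Create(first, last));
      }
      return batches;
   }

   public static void Main()
   {
      // The example from the text: 20 pages, at most 8 pages per batch
      foreach (var batch in SplitIntoBatches(20, 8))
      {
         Console.WriteLine("Recognize pages {0}-{1}", batch.Item1, batch.Item2);
      }
   }
}
```

For 20 pages and a maximum of 8, the sketch yields three batches (pages 1-8, 9-16 and 17-20), matching the 8 + 8 + 4 split described above.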

Example
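Visual Basic
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Leadtools.Forms.DocumentWriters
Imports Leadtools.Forms.Ocr

Private Shared Sub OcrAutoRecognizeManagerExample()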
      Console.WriteLine("Preparing the source and destination directories...")
      Dim sourceDirectory As String = LEAD_VARS.ImagesDir
      Dim destinationDirectory As String = Path.Combine(LEAD_VARS.ImagesDir, "AutoRecognizeManagerExample")

      ' Prepare the output directory
      If Not Directory.Exists(destinationDirectory) Then
         Directory.CreateDirectory(destinationDirectory)
      End If

      ' OCR some images from the source directory into the destination directory:
      Dim imageFiles As New List(Of String)

      For i As Integer = 1 To 4
         imageFiles.Add(Path.Combine(sourceDirectory, String.Format("Ocr{0}.tif", i)))
      Next

      Console.WriteLine("Creating an instance of the engine...")

      ' Create an instance of the engine
      Using ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Professional, False)
         ' Start the engine using default parameters
         Console.WriteLine("Starting up the engine...")
         ocrEngine.Startup(Nothing, Nothing, Nothing, Nothing)

         Dim ocrAutoRecognizeManager As IOcrAutoRecognizeManager = ocrEngine.AutoRecognizeManager

         ' Use LTD as a temporary format if a document has more than 4 pages to save memory
         ocrAutoRecognizeManager.MaximumPagesBeforeLtd = 4

         ' Use all CPUs/cores of the current machine to speed up recognition
         ' by passing either 0 or System.Environment.ProcessorCount
         ocrAutoRecognizeManager.MaximumThreadsPerJob = 0

         ' Deskew and auto-orient all pages before recognition
         ocrAutoRecognizeManager.PreprocessPageCommands.Clear()
         ocrAutoRecognizeManager.PreprocessPageCommands.Add(OcrAutoPreprocessPageCommand.Deskew)
         ocrAutoRecognizeManager.PreprocessPageCommands.Add(OcrAutoPreprocessPageCommand.Rotate)

         ' Create PDFs with Image/Text option
         Dim pdfOptions As PdfDocumentOptions = CType(ocrEngine.DocumentWriterInstance.GetOptions(DocumentFormat.Pdf), PdfDocumentOptions)
         pdfOptions.ImageOverText = True
         ocrEngine.DocumentWriterInstance.SetOptions(DocumentFormat.Pdf, pdfOptions)

         ' Loop through all the TIF files in the source directory, convert to PDF in the destination directory
         For Each imageFile As String In imageFiles
            ' Construct the name of the document file
            Dim documentFileName As String = Path.Combine(destinationDirectory, Path.GetFileNameWithoutExtension(imageFile))
            documentFileName = Path.ChangeExtension(documentFileName, "pdf")

            ' OCR the file
            Console.WriteLine("Processing {0}", imageFile)
            ocrAutoRecognizeManager.Run(imageFile, documentFileName, DocumentFormat.Pdf, Nothing, Nothing)
            Console.WriteLine("Saved: {0}", documentFileName)
         Next
      End Using
   End Sub

Public NotInheritable Class LEAD_VARS
   Public Const ImagesDir As String = "C:\Users\Public\Documents\LEADTOOLS Images"
End Class
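
C#
using System;
using System.Collections.Generic;
using System.IO;
using Leadtools.Forms.DocumentWriters;
using Leadtools.Forms.Ocr;

private static void OcrAutoRecognizeManagerExample()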
   {
      Console.WriteLine("Preparing the source and destination directories...");
      string sourceDirectory = LEAD_VARS.ImagesDir;
      string destinationDirectory = Path.Combine(LEAD_VARS.ImagesDir, "AutoRecognizeManagerExample");

      // Prepare the output directory
      if(!Directory.Exists(destinationDirectory))
      {
         Directory.CreateDirectory(destinationDirectory);
      }

      // OCR some images from the source directory into the destination directory:
      IList<string> imageFiles = new List<string>();

      for(int i = 1; i <= 4; i++)
      {
         imageFiles.Add(Path.Combine(sourceDirectory, string.Format("Ocr{0}.tif", i)));
      }

      Console.WriteLine("Creating an instance of the engine...");

      // Create an instance of the engine
      using(IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Professional, false))
      {
         // Start the engine using default parameters
         Console.WriteLine("Starting up the engine...");
         ocrEngine.Startup(null, null, null, null);

         IOcrAutoRecognizeManager ocrAutoRecognizeManager = ocrEngine.AutoRecognizeManager;

         // Use LTD as a temporary format if a document has more than 4 pages to save memory
         ocrAutoRecognizeManager.MaximumPagesBeforeLtd = 4;

         // Use all CPUs/cores of the current machine to speed up recognition
         // by passing either 0 or System.Environment.ProcessorCount
         ocrAutoRecognizeManager.MaximumThreadsPerJob = 0;

         // Deskew and auto-orient all pages before recognition
         ocrAutoRecognizeManager.PreprocessPageCommands.Clear();
         ocrAutoRecognizeManager.PreprocessPageCommands.Add(OcrAutoPreprocessPageCommand.Deskew);
         ocrAutoRecognizeManager.PreprocessPageCommands.Add(OcrAutoPreprocessPageCommand.Rotate);

         // Create PDFs with Image/Text option
         PdfDocumentOptions pdfOptions = ocrEngine.DocumentWriterInstance.GetOptions(DocumentFormat.Pdf) as PdfDocumentOptions;
         pdfOptions.ImageOverText = true;
         ocrEngine.DocumentWriterInstance.SetOptions(DocumentFormat.Pdf, pdfOptions);

         // Loop through all the TIF files in the source directory, convert to PDF in the destination directory
         foreach(string imageFile in imageFiles)
         {
            // Construct the name of the document file
            string documentFileName = Path.Combine(destinationDirectory, Path.GetFileNameWithoutExtension(imageFile));
            documentFileName = Path.ChangeExtension(documentFileName, "pdf");

            // OCR the file
            Console.WriteLine("Processing {0}", imageFile);
            ocrAutoRecognizeManager.Run(imageFile, documentFileName, DocumentFormat.Pdf, null, null);
            Console.WriteLine("Saved: {0}", documentFileName);
         }
      }
   }

static class LEAD_VARS
{
   public const string ImagesDir = @"C:\Users\Public\Documents\LEADTOOLS Images";
}
Requirements

Target Platforms: Windows 7, Windows Vista SP1 or later, Windows XP SP3, Windows Server 2008 (Server Core not supported), Windows Server 2008 R2 (Server Core supported with SP1 or later), Windows Server 2003 SP2

See Also

Reference

IOcrAutoRecognizeManager Members
Leadtools.Forms.Ocr Namespace
Leadtools.Forms.DocumentWriters.DocumentFormat
IOcrEngine Interface
OcrEngineManager Class
OcrEngineType Enumeration
Programming with the LEADTOOLS .NET OCR
Working with OCR Pages
Multi-Threading with LEADTOOLS OCR
LEADTOOLS OCR Thunk Server
Files to be Included with Your Application


© 2006-2012 All Rights Reserved. LEAD Technologies, Inc.

IOcrAutoRecognizeManager requires an OCR module license and unlock key. For more information, refer to: Imaging Pro/Document/Medical Features