Programming with Leadtools .NET OCR

The LEADTOOLS OCR Class Library provides methods for incorporating optical character recognition (OCR) technology into an application. OCR is used to process bitmap document images into text.

Once the LEADTOOLS .NET OCR toolkit is installed to the system, the user is ready to begin programming with LEADTOOLS OCR. Please note that the OCR features must be unlocked before the user can actually use the OCR properties, methods, and events. For more information on unlocking LEAD features, refer to Unlocking Special LEAD Features.

You can start using LEADTOOLS for .NET OCR in your application by adding references to the Leadtools.Forms.Ocr.dll and Leadtools.Forms.DocumentWriters.dll assemblies in your .NET application. These assemblies contain the various interfaces, classes, structures, and delegates used to program with LEADTOOLS OCR.

Since the toolkit supports multiple engines, the actual code that interfaces with the engine is stored in a separate assembly that is loaded dynamically once an instance of the IOcrEngine interface is created. Hence, you must make sure the engine assembly you are planning to use resides next to the Leadtools.Forms.Ocr.dll assembly. You can add the engine assembly as a reference to your project, if desired, to automatically detect dependencies, although LEADTOOLS does not require this.
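The placement requirement above can be checked before the engine is created. The following is a minimal sketch and not part of the LEADTOOLS API: the EngineAssemblyCheck class and its IsAdjacent method are hypothetical helpers built only on System.IO, and the caller is assumed to know the file name of the engine assembly it deploys.

```csharp
using System;
using System.IO;

// Hypothetical helper (not a LEADTOOLS type): verifies that an engine
// assembly sits in the same directory as the main OCR assembly.
static class EngineAssemblyCheck
{
    // ocrAssemblyPath: path to Leadtools.Forms.Ocr.dll (or any reference file).
    // engineAssemblyFileName: file name only, without a directory.
    public static bool IsAdjacent(string ocrAssemblyPath, string engineAssemblyFileName)
    {
        string directory = Path.GetDirectoryName(Path.GetFullPath(ocrAssemblyPath));
        if (directory == null)
        {
            // The path has no directory component to compare against
            return false;
        }

        return File.Exists(Path.Combine(directory, engineAssemblyFileName));
    }
}
```

For instance, the first argument could be typeof(IOcrEngine).Assembly.Location, and the second the file name of the engine assembly listed in Files To Be Included With Your Application.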

LEADTOOLS provides methods to:

Recognize and export text, choosing from a variety of text, word processing, database, or spreadsheet file formats.
Perform OCR processes in a single or multi-threaded environment with optimization for server-based operations.
Use multiple OCR engines, which are abstracted from the user through a common .NET class library. Switching between the various engines requires virtually no changes in the application code.
Select the language of documents to be recognized. Choose from English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, or Swedish.
Segment complex pages manually or automatically into text zones, image zones, table zones, lines, headers, and footers.
Set accuracy thresholds prior to recognition to control the accuracy of recognition.
Recognize text from 5 to 72 points in virtually any typeface.
Increase recognition accuracy with built-in and user dictionaries.
Automatically detect fax, dot matrix, and other degraded documents and compensate accordingly.
Process both text and graphics. The recognition software's ability to distinguish halftone graphics from text can provide the basis of a compound document processing system.
Save the document in any of 40 formats, including Adobe PDF and PDF/A, MS Word, MS Excel, as well as various flavors of ASCII and UNICODE text.

Using the OCR Engine

LEADTOOLS uses an OCR handle to interact with the OCR engine and the OCR document containing the list of pages. The OCR handle is a communication session between LEADTOOLS OCR and an OCR engine installed on the system. This OCR handle is an internal structure that contains all the necessary information for recognition, getting and setting information, and text verification.

The following is an outline of the general steps involved in recognizing one or more pages.

Select the engine type you wish to use and create an instance of the IOcrEngine interface. For more information, refer to Creating an OCR Engine Instance.
Start up the OCR engine with the IOcrEngine.Startup method. For more information, refer to Starting and Shutting down the Engine.
Optional. If saving is required, establish an OCR document with one or more pages. For more information, refer to Working with OCR Pages.
Establish zones on the page(s), either manually or automatically. (This is optional. A page can be recognized with or without zones.) For more information, refer to Working with OCR Zones.
Optional. Set the active languages to be used by the OCR engine. (The default is English). For more information, refer to Working with OCR Languages.
Optional. Set the spell checking properties. For more information, refer to OCR Spell Language Dictionaries.
Optional. Set any special recognition module options. This is required only if the page contains zones, created either automatically or manually. For more information, refer to Recognizing OCR Pages and Using OMR in LEADTOOLS .NET OCR.
Recognize. For more information, refer to Recognizing OCR Pages.
Optional. Save recognition results, if desired. The results can be saved to either a file or to memory. For more information, refer to Recognizing OCR Pages.
Shut down the OCR engine when finished. For more information, refer to Starting and Shutting down the Engine.

Steps 4, 5, 6, and 7 can be performed in any order, as long as they are carried out after starting up the OCR engine and before recognizing a page.

For more information on the engine assemblies, refer to OcrEngineType and Files To Be Included With Your Application.

From the general steps above, the LEADTOOLS OCR engine can be used in multiple ways:

OCR an image file (or LEADTOOLS RasterImage object) and obtain the text with optional formatting and position info. In this mode, an IOcrDocument object is not needed since the result is not going to be saved. The IOcrEngine.CreatePage method can be used to quickly create an IOcrPage from the RasterImage directly, call the necessary method (such as IOcrPage.Recognize) and then obtain the text directly using IOcrPage.GetText or IOcrPage.GetRecognizedCharacters. For an example, refer to IOcrEngine.CreatePage.

Low-level Optical Character Recognition of one or more pages followed by creating a final document such as PDF or DOCX. In this mode, the user generally creates an IOcrDocument object (in memory or file based) and then adds IOcrPage objects to it. The pages can be recognized either before or after they are added. When all the pages are added and recognized, IOcrDocument.Save is called to convert the recognition data to the final document. For an example, refer to IOcrDocument.

High-level Optical Character Recognition from an input image file to a final document such as PDF or DOCX. In this mode, you can use IOcrAutoRecognizeManager to convert the document in one shot. Various events and logging mechanisms can be used to modify and track the recognize operation. For an example, refer to IOcrAutoRecognizeManager.

The following example shows how to perform the above steps in code:

// Assuming you added "using Leadtools.Codecs;", "using Leadtools.Forms.Ocr;" and "using Leadtools.Forms.DocumentWriters;" at the beginning of this class 
// *** Step 1: Select the engine type and create an instance of the IOcrEngine interface. 
             
// We will use the LEADTOOLS OCR Advantage engine and use it in the same process 
IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false); 
             
// *** Step 2: Startup the engine. 
             
// Use the default parameters 
ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime"); 
             
// *** Step 3: Create an OCR document with one or more pages. 
             
IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument(); 
              
// Add all the pages of a multi-page TIF image to the document 
ocrDocument.Pages.AddPages(@"C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif", 1, -1, null); 
             
// *** Step 4: Establish zones on the page(s), either manually or automatically 
             
// Automatic zoning 
ocrDocument.Pages.AutoZone(null); 
             
// *** Step 5: (Optional) Set the active languages to be used by the OCR engine 
             
// Enable English and German languages 
ocrEngine.LanguageManager.EnableLanguages(new string[] { "en", "de" }); 
             
// *** Step 6: (Optional) Set the spell checking engine 
             
// Enable the spell checking system 
ocrEngine.SpellCheckManager.SpellCheckEngine = OcrSpellCheckEngine.Native; 
             
// *** Step 7: (Optional) Set any special recognition module options 
             
// Change the zone type of the first zone in the first page to Graphics so it will not be recognized
OcrZone ocrZone = ocrDocument.Pages[0].Zones[0];
ocrZone.ZoneType = OcrZoneType.Graphics;
ocrDocument.Pages[0].Zones[0] = ocrZone; 
             
// *** Step 8: Recognize 
             
ocrDocument.Pages.Recognize(null); 
             
// *** Step 9: Save recognition results 
             
// Save the results to a PDF file 
ocrDocument.Save(@"C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", DocumentFormat.Pdf, null); 
ocrDocument.Dispose(); 
             
// *** Step 10: Shut down the OCR engine when finished 
ocrEngine.Shutdown(); 
ocrEngine.Dispose();

' Assuming you added "Imports Leadtools.Forms.Ocr" and "Imports Leadtools.Forms.DocumentWriters" at the beginning of this class
' *** Step 1: Select the engine type and create an instance of the IOcrEngine interface. 
' We will use the LEADTOOLS OCR Advantage engine and use it in the same process 
Dim ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, False) 
             
' *** Step 2: Startup the engine. 
             
' Use the default parameters 
ocrEngine.Startup(Nothing, Nothing, Nothing, "C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime") 
             
' *** Step 3: Create an OCR document with one or more pages. 
             
Dim ocrDocument As IOcrDocument = ocrEngine.DocumentManager.CreateDocument() 
             
' Add all the pages of a multi-page TIF image to the document 
ocrDocument.Pages.AddPages("C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif", 1, -1, Nothing) 
             
' *** Step 4: Establish zones on the page(s), either manually or automatically 
             
' Automatic zoning 
ocrDocument.Pages.AutoZone(Nothing) 
             
' *** Step 5: (Optional) Set the active languages to be used by the OCR engine 
             
' Enable English and German languages 
ocrEngine.LanguageManager.EnableLanguages(New String() {"en", "de"}) 
             
' *** Step 6: (Optional) Set the spell checking engine 
' Enable the spell checking engine 
ocrEngine.SpellCheckManager.SpellCheckEngine = OcrSpellCheckEngine.Native 
             
' *** Step 7: (Optional) Set any special recognition module options 
             
' Change the zone type of the first zone in the first page to Graphics so it will not be recognized
Dim ocrZone As OcrZone = ocrDocument.Pages(0).Zones(0) 
ocrZone.ZoneType = OcrZoneType.Graphics 
ocrDocument.Pages(0).Zones(0) = ocrZone 
             
' *** Step 8: Recognize 
             
ocrDocument.Pages.Recognize(Nothing) 
             
' *** Step 9: Save recognition results

' Save the results to a PDF file
ocrDocument.Save("C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", DocumentFormat.Pdf, Nothing) 
ocrDocument.Dispose() 
             
' *** Step 10: Shut down the OCR engine when finished 
ocrEngine.Shutdown() 
ocrEngine.Dispose()

The following sections show each of these approaches in code.

Using IOcrPage

Note: This mode is supported only by the LEADTOOLS OCR Advantage engine. Calling IOcrEngine.CreatePage using any other OCR engine will result in an exception being thrown.

The following example uses an OCR page without a document.

// Create the engine instance 
using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false)) 
{ 
   // Startup the engine 
   ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime"); 
              
   // Load the first page as RasterImage 
   RasterImage rasterImage = ocrEngine.RasterCodecsInstance.Load(@"C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif", 1); 
              
   // Create an OCR page from this image, transfer ownership of the RasterImage object
   using (IOcrPage ocrPage = ocrEngine.CreatePage(rasterImage, OcrImageSharingMode.AutoDispose)) 
   { 
      // Recognize the page 
      ocrPage.Recognize(null); 
              
      // Show the text of all zones 
      for (int zoneIndex = 0; zoneIndex < ocrPage.Zones.Count; zoneIndex++) 
      { 
         string text = ocrPage.GetText(zoneIndex); 
         Console.WriteLine(text); 
      } 
   } 
              
   // The engine shuts down automatically when Dispose is called
}

' Create the engine instance 
Using ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, False) 
   ' Startup the engine 
   ocrEngine.Startup(Nothing, Nothing, Nothing, "C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime") 
              
   ' Load the first page as RasterImage 
   Dim rasterImage As RasterImage = ocrEngine.RasterCodecsInstance.Load("C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif", 1) 
              
   ' Create an OCR page from this image, transfer ownership of the RasterImage object
   Using ocrPage As IOcrPage = ocrEngine.CreatePage(rasterImage, OcrImageSharingMode.AutoDispose)
      ' Recognize the page 
      ocrPage.Recognize(Nothing) 
              
      ' Show the text of all zones 
      For zoneIndex As Integer = 0 To ocrPage.Zones.Count - 1 
         Dim text As String = ocrPage.GetText(zoneIndex) 
         Console.WriteLine(text) 
      Next 
   End Using 
              
   ' The engine shuts down automatically when Dispose is called
End Using

Using IOcrDocument

An instance of IOcrDocument is needed to save the OCR results to a final document such as PDF or DOCX. One or more OCR pages can be added to the document, and then the various Save methods can be called to create the final document.

IOcrDocument can be used in two ways:

Memory-Based Documents

In this mode, the OCR pages are required to be in memory before saving. This mode is not recommended when the document has a large number of pages; in that case, either a file-based document or the LEADTOOLS temporary file format (DocumentFormat.Ltd) is required.

In a memory-based IOcrDocument, the IOcrPageCollection holds the pages. The user can recognize any or all of the pages at any time, and pages can be added or removed at will.

The following example uses a memory-based document to create a multi-page PDF file. Note how all the pages are kept in memory during save:

// Create the engine instance 
using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false)) 
{ 
   // Startup the engine 
   ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime"); 
              
   // Create the OCR document in memory 
   using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument(null, OcrCreateDocumentOptions.InMemory)) 
   { 
      string imageFile = @"C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif"; 
             
      // Add all the pages to the document 
      ocrDocument.Pages.AddPages(imageFile, 1, -1, null); 
              
      // Recognize all the pages 
      ocrDocument.Pages.Recognize(null); 
              
      // Save recognition results as PDF 
      ocrDocument.Save(@"C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", DocumentFormat.Pdf, null); 
   } 
}

' Create the engine instance 
Using ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, False) 
   ' Startup the engine 
   ocrEngine.Startup(Nothing, Nothing, Nothing, "C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime") 
              
   ' Create the OCR document in memory 
   Using ocrDocument As IOcrDocument = ocrEngine.DocumentManager.CreateDocument(Nothing, OcrCreateDocumentOptions.InMemory) 
      Dim imageFile As String = "C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif" 
              
      ' Add all the pages to the document 
      ocrDocument.Pages.AddPages(imageFile, 1, -1, Nothing) 
              
      ' Recognize all the pages 
      ocrDocument.Pages.Recognize(Nothing) 
              
      ' Save recognition results as PDF 
      ocrDocument.Save("C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", DocumentFormat.Pdf, Nothing) 
   End Using 
End Using

File-Based Documents

In this mode, the OCR pages are not required to be in memory before saving. This mode is recommended when the document has a large number of pages.

In a file-based IOcrDocument, the IOcrPageCollection is a store-only view of the pages. When a page is added, a snapshot of the current recognition data is saved into the document. This data cannot be modified afterward, and the page is no longer needed. The user must recognize the pages before they are added to the document, and pages can only be added, not removed.

The following example uses a file-based document to create a multi-page PDF file. Notice how the pages are disposed after they are recognized and not required during save.

// Create the engine instance
using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false))
{
   // Startup the engine
   ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime");

   // Create a file-based OCR document
   using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument(null, OcrCreateDocumentOptions.AutoDeleteFile))
   {
      string imageFile = @"C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif";

      // Get the number of pages in the document
      int pageCount = ocrEngine.RasterCodecsInstance.GetTotalPages(imageFile);

      // Create a page
      for (int page = 1; page <= pageCount; page++)
      {
         // Load a RasterImage
         RasterImage rasterImage = ocrEngine.RasterCodecsInstance.Load(imageFile, page);

         // Create an OCR page from this image, transfer ownership of the RasterImage object
         using (IOcrPage ocrPage = ocrEngine.CreatePage(rasterImage, OcrImageSharingMode.AutoDispose))
         {
            // Recognize the page
            ocrPage.Recognize(null);

            // Add it to the document
            ocrDocument.Pages.Add(ocrPage);

            // Page will be disposed here and its memory freed
         }
      }

      // Save recognition results as PDF
      ocrDocument.Save(@"C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", DocumentFormat.Pdf, null);
   }
}

' Create the engine instance 
Using ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, False) 
   ' Startup the engine 
   ocrEngine.Startup(Nothing, Nothing, Nothing, "C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime") 
              
   ' Create a file-based OCR document 
   Using ocrDocument As IOcrDocument = ocrEngine.DocumentManager.CreateDocument(Nothing, OcrCreateDocumentOptions.AutoDeleteFile) 
      Dim imageFile As String = "C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif" 
             
      ' Get the number of pages in the document 
      Dim pageCount As Integer = ocrEngine.RasterCodecsInstance.GetTotalPages(imageFile) 
             
      ' Create a page 
      For page As Integer = 1 To pageCount 
         ' Load a RasterImage 
         Dim rasterImage As RasterImage = ocrEngine.RasterCodecsInstance.Load(imageFile, page) 
              
         ' Create an OCR page from this image, transfer ownership of the RasterImage object
         Using ocrPage As IOcrPage = ocrEngine.CreatePage(rasterImage, OcrImageSharingMode.AutoDispose) 
            ' Recognize the page 
            ocrPage.Recognize(Nothing) 
              
            ' Add it to the document 
            ocrDocument.Pages.Add(ocrPage) 
              
            ' Page will be disposed here and its memory freed 
         End Using 
      Next 
              
      ' Save recognition results as PDF 
      ocrDocument.Save("C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", DocumentFormat.Pdf, Nothing)
   End Using 
End Using

File-based documents can also be saved and re-loaded later to continue adding pages or to convert them to the final document. The following example shows how to do that:

private static void Test4() 
{ 
   // Create the engine instance 
   IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false); 
   // Startup the engine 
   ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime"); 
             
   string imageFile1 = @"C:\Users\Public\Documents\LEADTOOLS Images\Ocr1.tif"; 
   string imageFile2 = @"C:\Users\Public\Documents\LEADTOOLS Images\Ocr2.tif"; 
             
   // Create a file-based OCR document 
   // Pass a file name (will be re-used) and tell the document to not delete it 
   string documentFile = @"C:\Users\Public\Documents\LEADTOOLS Images\document.bin"; 
   using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument(documentFile, OcrCreateDocumentOptions.None)) 
   { 
      // Verify the document does not have any pages 
      System.Diagnostics.Debug.Assert(ocrDocument.Pages.Count == 0); 
             
      // Add a page 
      RasterImage rasterImage = ocrEngine.RasterCodecsInstance.Load(imageFile1, 1); 
      using (IOcrPage ocrPage = ocrEngine.CreatePage(rasterImage, OcrImageSharingMode.AutoDispose)) 
      { 
         ocrPage.Recognize(null); 
         ocrDocument.Pages.Add(ocrPage); 
      } 
             
      // Here the document is disposed but the file will not be deleted 
   } 
             
   // Re-load the document 
   using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument(documentFile, OcrCreateDocumentOptions.LoadExisting)) 
   { 
      // Verify the document has one page 
      System.Diagnostics.Debug.Assert(ocrDocument.Pages.Count == 1); 
             
      // Add another page 
      RasterImage rasterImage = ocrEngine.RasterCodecsInstance.Load(imageFile2, 1); 
      using (IOcrPage ocrPage = ocrEngine.CreatePage(rasterImage, OcrImageSharingMode.AutoDispose)) 
      { 
         ocrPage.Recognize(null); 
         ocrDocument.Pages.Add(ocrPage); 
      } 
             
      // Verify that the document has 2 pages 
      System.Diagnostics.Debug.Assert(ocrDocument.Pages.Count == 2); 
             
      // Save the document 
      ocrDocument.Save(@"C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", DocumentFormat.Pdf, null); 
             
      // Result will be a PDF file with two pages 
   } 
             
   // Finally, delete the document file 
   System.IO.File.Delete(documentFile); 
   ocrEngine.Dispose(); 
}

' Create the engine instance 
Dim ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, False) 
' Startup the engine 
ocrEngine.Startup(Nothing, Nothing, Nothing, "C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime") 
             
Dim imageFile1 As String = "C:\Users\Public\Documents\LEADTOOLS Images\Ocr1.tif" 
Dim imageFile2 As String = "C:\Users\Public\Documents\LEADTOOLS Images\Ocr2.tif" 
             
' Create a file-based OCR document 
' Pass a file name (will be re-used) and tell the document to not delete it 
Dim documentFile As String = "C:\Users\Public\Documents\LEADTOOLS Images\document.bin" 
Using ocrDocument As IOcrDocument = ocrEngine.DocumentManager.CreateDocument(documentFile, OcrCreateDocumentOptions.None) 
   ' Verify the document does not have any pages 
   System.Diagnostics.Debug.Assert(ocrDocument.Pages.Count = 0) 
             
   ' Add a page 
   Dim rasterImage As RasterImage = ocrEngine.RasterCodecsInstance.Load(imageFile1, 1) 
   Using ocrPage As IOcrPage = ocrEngine.CreatePage(rasterImage, OcrImageSharingMode.AutoDispose) 
      ocrPage.Recognize(Nothing) 
      ocrDocument.Pages.Add(ocrPage) 
   End Using 
             
   ' Here the document is disposed but the file will not be deleted 
End Using 
             
' Re-load the document 
Using ocrDocument As IOcrDocument = ocrEngine.DocumentManager.CreateDocument(documentFile, OcrCreateDocumentOptions.LoadExisting) 
   ' Verify the document has one page 
   System.Diagnostics.Debug.Assert(ocrDocument.Pages.Count = 1) 
             
   ' Add another page 
   Dim rasterImage As RasterImage = ocrEngine.RasterCodecsInstance.Load(imageFile2, 1) 
   Using ocrPage As IOcrPage = ocrEngine.CreatePage(rasterImage, OcrImageSharingMode.AutoDispose) 
      ocrPage.Recognize(Nothing) 
      ocrDocument.Pages.Add(ocrPage) 
   End Using 
             
   ' Verify that the document has 2 pages 
   System.Diagnostics.Debug.Assert(ocrDocument.Pages.Count = 2) 
             
   ' Save the document 
   ocrDocument.Save("C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", DocumentFormat.Pdf, Nothing) 
             
   ' Result will be a PDF file with two pages 
End Using 
             
' Finally, delete the document file 
System.IO.File.Delete(documentFile) 
             
ocrEngine.Dispose()

Using IOcrAutoRecognizeManager

All the previous techniques require low-level code to load a page, recognize it, and add it to a document. In addition, the LEADTOOLS OCR engines support performing the same task using the one-shot "fire and forget" IOcrAutoRecognizeManager interface. In this high-level mode, a single method converts the input image directly to the output format using the best options.

// Create the engine instance 
using(IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false)) 
{ 
   // Startup the engine 
   ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime"); 
             
   // Convert the multi-page TIF image to a PDF document 
   ocrEngine.AutoRecognizeManager.Run( 
      @"C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif", 
      @"C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", 
      DocumentFormat.Pdf, 
      null, 
      null); 
}

' Create the engine instance 
Using ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, False) 
   ' Startup the engine 
   ocrEngine.Startup(Nothing, Nothing, Nothing, "C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime") 
             
   ' Convert the multi-page TIF image to a PDF document 
   ocrEngine.AutoRecognizeManager.Run( _ 
      "C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif", _ 
      "C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", _ 
      DocumentFormat.Pdf, _ 
      Nothing, _ 
      Nothing) 
End Using
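Because the one-shot conversion takes only an input path and an output path, it lends itself to batch processing. The following is a minimal sketch under stated assumptions, not a LEADTOOLS facility: BatchConvert and ConvertAll are hypothetical names, the folder layout is illustrative, and the conversion itself is passed in as a delegate so the helper depends only on the .NET base class library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not a LEADTOOLS type): pairs each .tif file in a
// folder with a .pdf output path and hands the pair to a caller-supplied action.
static class BatchConvert
{
    // Returns the output paths in the order the inputs were processed.
    public static List<string> ConvertAll(string inputDirectory, string outputDirectory, Action<string, string> convert)
    {
        // Keep only files whose extension is exactly .tif (case-insensitive)
        var inputs = new List<string>();
        foreach (string file in Directory.GetFiles(inputDirectory))
        {
            if (string.Equals(Path.GetExtension(file), ".tif", StringComparison.OrdinalIgnoreCase))
            {
                inputs.Add(file);
            }
        }

        // Sort for a predictable processing order
        inputs.Sort(StringComparer.OrdinalIgnoreCase);

        var outputs = new List<string>();
        foreach (string input in inputs)
        {
            string output = Path.Combine(outputDirectory, Path.GetFileNameWithoutExtension(input) + ".pdf");
            convert(input, output);
            outputs.Add(output);
        }

        return outputs;
    }
}
```

With a started engine, the delegate could be a lambda such as (input, output) => ocrEngine.AutoRecognizeManager.Run(input, output, DocumentFormat.Pdf, null, null), which reuses the call shown in the example above. Passing the conversion as a delegate keeps the file-pairing logic testable without an OCR engine.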