Take the following steps to create and run a program that shows how to scan a document and convert it to a searchable PDF file.
Start Visual Studio.
Choose File->New->Project from the menu.
In the New Project dialog box, choose either "Visual C# Projects" or "VB Projects" in the Projects Type List, and choose "Windows Application" or "Windows Forms Application" depending on your Visual Studio version from the Templates List.
Type "OcrTutorial3" in the Project Name field. If desired, type a new location for your project or select a directory using the Browse button, and then choose OK.
In the "Solution Explorer" window, right-click on the "References" folder, and select "Add Reference..." from the context menu. In the "Add Reference" dialog box, select the ".NET" tab and browse to LEADTOOLS For .NET "<LEADTOOLS_INSTALLDIR>\Bin\DotNet4\Win32" folder and select the following DLLs:
Note: The Leadtools.Codecs.*.dll references added are for the BMP, JPG, CMP, TIF and FAX image formats. Add any additional file format codec DLL if required in your application.
Drag and drop three buttons onto Form1. Leave the button names at their defaults (button1, button2, button3), then set the Text property of each button as follows:
| Button | Text |
|---|---|
| button1 | Change output directory |
| button2 | Select the Scanner |
| button3 | Scan and OCR |
Switch to Form1 code view (Right-click Form1 in the solution explorer then select View Code) and add the following lines at the beginning of the file after any using or Imports section if there are any:
C#:

    using Leadtools;
    using Leadtools.Twain;
    using Leadtools.ImageProcessing;
    using Leadtools.ImageProcessing.Core;
    using Leadtools.Forms.Common;
    using Leadtools.Document.Writer;
    using Leadtools.Ocr;

VB:

    Imports Leadtools
    Imports Leadtools.Twain
    Imports Leadtools.ImageProcessing
    Imports Leadtools.ImageProcessing.Core
    Imports Leadtools.Forms.Common
    Imports Leadtools.Document.Writer
    Imports Leadtools.Ocr

Add the following private variables to the Form1 class:
C#:

    // The OCR engine instance
    private IOcrEngine _ocrEngine;
    // OCR document instance
    private IOcrDocument _ocrDocument;
    // The Twain session
    private TwainSession _twainSession;
    // The output directory for saving PDF files
    private string _outputDirectory = @"C:\MyImages";
    // The image processing commands we are going to use to clean the scanned image
    private List<RasterCommand> _imageProcessingCommands;
    // Number of scans performed so far, used to build the output file name
    private int _scanCount;

VB:

    ' The OCR engine instance
    Private _ocrEngine As IOcrEngine
    ' OCR document instance
    Private _ocrDocument As IOcrDocument
    ' The Twain session
    Private _twainSession As TwainSession
    ' The output directory for saving PDF files
    Private _outputDirectory As String = "C:\MyImages"
    ' The image processing commands we are going to use to clean the scanned image
    Private _imageProcessingCommands As List(Of RasterCommand)
    ' Number of scans performed so far, used to build the output file name
    Private _scanCount As Integer

Override the OnLoad method of Form1 and add the following code:
C#:

    protected override void OnLoad(EventArgs e)
    {
        // Initialize the OCR engine
        _ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD, false);
        // Startup the engine
        _ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 20\Bin\Common\OcrLEADRuntime");

        // Initialize Twain scanning session
        _twainSession = new TwainSession();
        _twainSession.Startup(this.Handle, "My Company", "My Product", "My Version", "My Application", TwainStartupFlags.None);
        // Subscribe to the TwainSession.AcquirePage event to get the image
        _twainSession.AcquirePage += new EventHandler<TwainAcquirePageEventArgs>(_twainSession_AcquirePage);

        // Initialize the image processing commands we are going to use
        // Add as many as you like, here we will add Deskew and Despeckle
        _imageProcessingCommands = new List<RasterCommand>();
        _imageProcessingCommands.Add(new DeskewCommand());
        _imageProcessingCommands.Add(new DespeckleCommand());

        base.OnLoad(e);
    }

VB:

    Protected Overrides Sub OnLoad(e As EventArgs)
        ' Initialize the OCR engine
        _ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD, False)
        ' Startup the engine
        _ocrEngine.Startup(Nothing, Nothing, Nothing, "C:\LEADTOOLS 20\Bin\Common\OcrLEADRuntime")

        ' Initialize Twain scanning session
        _twainSession = New TwainSession()
        _twainSession.Startup(Me.Handle, "My Company", "My Product", "My Version", "My Application", TwainStartupFlags.None)
        ' Subscribe to the TwainSession.AcquirePage event to get the image
        AddHandler _twainSession.AcquirePage, AddressOf _twainSession_AcquirePage

        ' Initialize the image processing commands we are going to use
        ' Add as many as you like, here we will add Deskew and Despeckle
        _imageProcessingCommands = New List(Of RasterCommand)()
        _imageProcessingCommands.Add(New DeskewCommand())
        _imageProcessingCommands.Add(New DespeckleCommand())

        MyBase.OnLoad(e)
    End Sub

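The OnLoad code above passes a hard-coded OCR runtime path to Startup, which only works if LEADTOOLS is installed in that exact location. As a minimal sketch, assuming you want to probe a few candidate folders before starting the engine, a small helper can return the first one that exists. `OcrRuntimeLocator` and `FindRuntimeDirectory` are hypothetical names introduced here for illustration; they are not part of LEADTOOLS, and the helper uses only standard .NET file APIs.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of LEADTOOLS.
public static class OcrRuntimeLocator
{
    // Returns the full path of the first candidate directory that exists,
    // or null if none of the candidates exist.
    public static string FindRuntimeDirectory(params string[] candidates)
    {
        if (candidates == null)
            return null;

        foreach (string candidate in candidates)
        {
            // Skip empty entries, then check the file system
            if (!string.IsNullOrEmpty(candidate) && Directory.Exists(candidate))
                return Path.GetFullPath(candidate);
        }

        return null;
    }
}
```

In OnLoad, the result could be checked for null (for example, to show a message box) and then passed as the fourth argument of `_ocrEngine.Startup` in place of the literal path.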
Override the OnFormClosed method of Form1 and add the following code:
C#:

    protected override void OnFormClosed(FormClosedEventArgs e)
    {
        // Shutdown and dispose the OCR engine
        _ocrEngine.Dispose();
        // And the twain session
        _twainSession.Shutdown();

        base.OnFormClosed(e);
    }

VB:

    Protected Overrides Sub OnFormClosed(e As FormClosedEventArgs)
        ' Shutdown and dispose the OCR engine
        _ocrEngine.Dispose()
        ' And the twain session
        _twainSession.Shutdown()

        MyBase.OnFormClosed(e)
    End Sub

Add the following code for the button1 (Change output directory) Click handler:
C#:

    private void button1_Click(object sender, EventArgs e)
    {
        // Change the output directory
        using (FolderBrowserDialog dlg = new FolderBrowserDialog())
        {
            dlg.SelectedPath = _outputDirectory;
            dlg.ShowNewFolderButton = true;
            if (dlg.ShowDialog(this) == DialogResult.OK)
                _outputDirectory = System.IO.Path.GetFullPath(dlg.SelectedPath);
        }
    }

VB:

    Private Sub button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles button1.Click
        ' Change the output directory
        Using dlg As New FolderBrowserDialog()
            dlg.SelectedPath = _outputDirectory
            dlg.ShowNewFolderButton = True
            If dlg.ShowDialog(Me) = DialogResult.OK Then
                _outputDirectory = System.IO.Path.GetFullPath(dlg.SelectedPath)
            End If
        End Using
    End Sub

Add the following code for the button2 (Select the Scanner) Click handler:
C#:

    private void button2_Click(object sender, EventArgs e)
    {
        // Select the scanner to use
        _twainSession.SelectSource(null);
    }

VB:

    Private Sub button2_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles button2.Click
        ' Select the scanner to use
        _twainSession.SelectSource(Nothing)
    End Sub

Add the following code for the button3 (Scan and OCR) Click handler:
C#:

    private void button3_Click(object sender, EventArgs e)
    {
        // Create the output directory if it does not exist
        if (!System.IO.Directory.Exists(_outputDirectory))
            System.IO.Directory.CreateDirectory(_outputDirectory);

        // Build the output PDF file name
        string name = "Scanned" + _scanCount;
        _scanCount++;
        string pdfFileName = System.IO.Path.Combine(_outputDirectory, name + ".pdf");

        // Create a new file-based OCR document to add the scanned pages to
        _ocrDocument = _ocrEngine.DocumentManager.CreateDocument(null, OcrCreateDocumentOptions.AutoDeleteFile);

        // Scan the new page(s)
        _twainSession.Acquire(TwainUserInterfaceFlags.Show);

        // Save as PDF
        _ocrDocument.Save(pdfFileName, DocumentFormat.Pdf, null);

        // Delete the document
        _ocrDocument.Dispose();

        // Show the result PDF file
        System.Diagnostics.Process.Start(pdfFileName);
    }

VB:

    Private Sub button3_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles button3.Click
        ' Create the output directory if it does not exist
        If Not System.IO.Directory.Exists(_outputDirectory) Then
            System.IO.Directory.CreateDirectory(_outputDirectory)
        End If

        ' Build the output PDF file name
        ' (use & with ToString so the integer is concatenated rather than added numerically)
        Dim name As String = "Scanned" & _scanCount.ToString()
        _scanCount = _scanCount + 1
        Dim pdfFileName As String = System.IO.Path.Combine(_outputDirectory, name & ".pdf")

        ' Create a new file-based OCR document to add the scanned pages to
        _ocrDocument = _ocrEngine.DocumentManager.CreateDocument(Nothing, OcrCreateDocumentOptions.AutoDeleteFile)

        ' Scan the new page(s)
        _twainSession.Acquire(TwainUserInterfaceFlags.Show)

        ' Save as PDF
        _ocrDocument.Save(pdfFileName, DocumentFormat.Pdf, Nothing)

        ' Delete the document
        _ocrDocument.Dispose()

        ' Show the result PDF file
        System.Diagnostics.Process.Start(pdfFileName)
    End Sub

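Because `_scanCount` starts at zero each time the program runs, the handler above builds the same file names (Scanned0.pdf, Scanned1.pdf, and so on) in every session, so a file left over from an earlier run would be targeted again. The following is a minimal sketch, assuming you would rather skip names that are already taken. `OutputNaming` and `NextAvailablePdfPath` are hypothetical names introduced here; the helper relies only on standard .NET file APIs and is not part of LEADTOOLS.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of LEADTOOLS.
public static class OutputNaming
{
    // Returns the first path of the form <directory>\<prefix><n>.pdf that does
    // not exist yet, trying n = startIndex, startIndex + 1, and so on.
    public static string NextAvailablePdfPath(string directory, string prefix, int startIndex)
    {
        if (startIndex < 0)
            throw new ArgumentOutOfRangeException("startIndex");

        for (int i = startIndex; i < int.MaxValue; i++)
        {
            string candidate = Path.Combine(directory, prefix + i + ".pdf");
            if (!File.Exists(candidate))
                return candidate;
        }

        // Practically unreachable, but keeps the method well-defined
        throw new IOException("No free file name found in " + directory);
    }
}
```

In the Click handler, the file name could then be obtained with `OutputNaming.NextAvailablePdfPath(_outputDirectory, "Scanned", _scanCount)` instead of building it directly from `_scanCount`.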
Finally, add the following code for the TWAIN AcquirePage event handler:
C#:

    private void _twainSession_AcquirePage(object sender, TwainAcquirePageEventArgs e)
    {
        // We have a page
        RasterImage image = e.Image;

        // First, run the image processing commands on it
        foreach (RasterCommand command in _imageProcessingCommands)
        {
            command.Run(image);
        }

        // Create an OCR page for it
        using (IOcrPage ocrPage = _ocrEngine.CreatePage(image, OcrImageSharingMode.AutoDispose))
        {
            // Recognize it and add it to the document
            ocrPage.Recognize(null);
            _ocrDocument.Pages.Add(ocrPage);
        }
    }

VB:

    Private Sub _twainSession_AcquirePage(sender As Object, e As TwainAcquirePageEventArgs)
        ' We have a page
        Dim image As RasterImage = e.Image

        ' First, run the image processing commands on it
        For Each command As RasterCommand In _imageProcessingCommands
            command.Run(image)
        Next

        ' Create an OCR page for it
        Using ocrPage As IOcrPage = _ocrEngine.CreatePage(image, OcrImageSharingMode.AutoDispose)
            ' Recognize it and add it to the document
            ocrPage.Recognize(Nothing)
            _ocrDocument.Pages.Add(ocrPage)
        End Using
    End Sub

Build and run the program to test it.