Tutorial: Auto Recognize and Process a Form

Processing forms and invoices is a large part of many companies' day-to-day workflow. When a person fills out a copy of a form and it is scanned back into the company's system, the information on it then needs to be extracted. Many OCR engines struggle with this because the form may have been scanned at a lower resolution than the original, the scanner may have introduced noise, or the fields may be unstructured and dynamically generated. Thankfully, the LEADTOOLS Forms Recognition SDK takes care of all of that and eliminates the need for any additional manual processing. Powered by LEAD’s patented machine learning algorithms, these advanced forms recognition and OCR libraries handle both structured and unstructured forms and can help save companies valuable time and money.

Quick and accurate forms recognition relies on two primary LEADTOOLS components, the AutoFormsEngine and the IOcrEngine. The AutoFormsEngine provides high-level functionality to recognize, process, and create forms, while the IOcrEngine is the entry point to all OCR functionality provided by LEADTOOLS.

The code below shows the core of what is needed to get a .NET forms recognition and OCR application running. If you want a complete step-by-step tutorial, head over to the LEADTOOLS documentation for the Auto Recognize and Process a Form tutorial.

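// Namespaces used by the code below
using System;
using Leadtools.Codecs;
using Leadtools.Forms.Auto;
using Leadtools.Forms.Processing;
using Leadtools.Forms.Recognition;
using Leadtools.Ocr;

// Add these global members
static AutoFormsEngine autoEngine;
static RasterCodecs codecs;
static IOcrEngine ocrEngine;
static DiskMasterFormsRepository formsRepository;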

// Initialize the Engines
static void InitFormsEngines()
{
	codecs = new RasterCodecs();
	
	ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD, false);
	ocrEngine.Startup(codecs, null, null, @"C:\LEADTOOLS 20\Bin\Common\OcrLEADRuntime");
	
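	// Keep each verbatim path on one line; a line break inside @"..." becomes part of the path
	formsRepository = new DiskMasterFormsRepository(codecs,
		@"C:\Users\Public\Documents\LEADTOOLS Images\Forms\MasterForm Sets\OCR");
	// null: no barcode engine; 30 and 80: minimum confidence thresholds; true: recognize the first page only
	autoEngine = new AutoFormsEngine(formsRepository, ocrEngine, null,
		AutoFormsRecognitionManager.Default | AutoFormsRecognitionManager.Ocr, 30, 80, true);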
}

// Recognize and Process a Form
static void RecognizeAndProcessForm()
{
	string resultMessage = "Form not recognized";
	string formToRecognize =
		@"C:\Users\Public\Documents\LEADTOOLS Images\Forms\Forms to be Recognized\OCR\W9_OCR_Filled.tif";

	// Run returns null when no master form matches the input
	AutoFormsRunResult runResult = autoEngine.Run(formToRecognize, null);
	if (runResult != null)
	{
		FormRecognitionResult recognitionResult = runResult.RecognitionResult.Result;
		resultMessage = $"This form has been recognized as a {runResult.RecognitionResult.MasterForm.Name} " +
			$"with {recognitionResult.Confidence} confidence.";
	}

	Console.WriteLine("Recognition Results:");
	Console.WriteLine(resultMessage);
	Console.WriteLine("=========================================================================");

	// Field results exist only when the form was recognized
	if (runResult != null)
		ShowProcessedResults(runResult);
}

// Print the output of the results
private static void ShowProcessedResults(AutoFormsRunResult runResult)
{
    string resultsMessage = "";

    foreach (FormPage formPage in runResult.FormFields)
        foreach (FormField field in formPage)
            // Only text fields carry a TextFormFieldResult; skip null fields and other result types
            if (field?.Result is TextFormFieldResult textResult)
                resultsMessage += $"{field.Name} = {textResult.Text}\n";

    Console.WriteLine("Field Processing Results:");
    Console.WriteLine(resultsMessage);
}
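
The snippets above define the initialization and recognition methods but never call them or release the engines. A minimal sketch of an entry point that ties them together might look like the following. It assumes it lives in the same class as the members above and that the LEADTOOLS license has already been set as described in the full tutorial. `CleanupFormsEngines` is a hypothetical helper name used here for illustration, not part of the SDK.

```csharp
// Hypothetical entry point; assumes the static members and methods shown above
// are in the same class and that the LEADTOOLS license has already been set.
static void Main(string[] args)
{
    try
    {
        InitFormsEngines();
        RecognizeAndProcessForm();
    }
    finally
    {
        // Runs even if initialization or recognition throws
        CleanupFormsEngines();
    }
}

// Hypothetical helper: release the engines in the reverse order of creation
static void CleanupFormsEngines()
{
    autoEngine?.Dispose();

    if (ocrEngine != null)
    {
        // Shut the OCR engine down before disposing it
        if (ocrEngine.IsStarted)
            ocrEngine.Shutdown();
        ocrEngine.Dispose();
    }

    codecs?.Dispose();
}
```

Wrapping the calls in try/finally means the OCR engine is shut down and the native resources are released even when recognition fails partway through.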

Try it out!

To test this for yourself, download the latest LEADTOOLS SDK for free straight from our site if you have not already. The free trial is good for 60 days and comes with unlimited chat and email support.

Support

Need help getting this sample up and going? Contact our support team for free technical support! For pricing or licensing questions, you can contact our sales team (sales@leadtools.com) or call us at 704-332-5532.


If you haven’t already read our prior post on how to Create a Multipage File from Multiple Images, check that out and stay tuned for more. We’ll be featuring a lot more tutorials that programmers can use to develop applications that will directly impact data capture, recognition, exchange, and other pressing business needs.
