Help Version 16.5.03.01
Forms Recognition and Processing Concept

One of the key features of any successful ECM (Enterprise Content Management) system, particularly Document Imaging applications, is Forms Recognition and Processing. While these systems generally handle everything from capture and management to storage and delivery of documents, Forms Recognition and Processing is a crucial element that can optimize the entire workflow.

Forms Recognition is the process of taking a filled form and automatically determining which type of form it is. Forms Processing is the process of automatically extracting key information from a filled form (name, address, date, social security number, etc.). Automating both steps replaces work that was previously done manually, form by form. This allows companies to process more forms in a given period, save money, and operate more efficiently.

In most cases, a typical workflow begins with the creation of the actual form to be processed and ends with the extracted data being stored in a database for later retrieval, report generation, etc.

Form Creation – This is where the actual form is created and all relevant information and fields are added to the form.

Distribution – The forms are distributed to the users to be filled in. Documents can be distributed electronically, or on paper.

Input/Capture – Documents can be captured in a variety of ways (see below). Once the document is captured, it is sent to the Document Management System for processing.

  • Scan/Fax
  • Importing electronically filled documents. These could be filled PDFs, Word documents, spreadsheets, etc.
  • Using existing electronic documents.

Image Cleanup – To maximize recognition and processing results, the document should be as clean as possible. LEADTOOLS provides an extensive set of processing methods to correct common problems in scanned or faxed images, such as line removal, hole punch removal, dot removal, etc.
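To illustrate what a cleanup step such as dot removal does, here is a minimal sketch in plain Python. It is not the LEADTOOLS implementation or API. It assumes the page is a binary image stored as a list of rows (1 = ink, 0 = background), and the function name remove_small_dots and the max_dot_pixels parameter are hypothetical names chosen for this example.

```python
from collections import deque


def remove_small_dots(image, max_dot_pixels=2):
    """Return a copy of a binary image (1 = ink, 0 = background) in which
    every 4-connected ink blob of at most max_dot_pixels pixels is erased.

    Illustrative only: this is a generic despeckle, not a LEADTOOLS call.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects one connected blob of ink.
            blob = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    in_bounds = 0 <= ny < height and 0 <= nx < width
                    if in_bounds and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Blobs at or below the size limit are treated as noise.
            if len(blob) <= max_dot_pixels:
                for by, bx in blob:
                    cleaned[by][bx] = 0
    return cleaned
```

Larger strokes, such as printed text and filled-in fields, survive because their blobs exceed the size limit. The limit would need tuning to the scan resolution.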

Recognition – Once the image is clean, it is ready for recognition. At this point, we attempt to determine which type of form the image contains. For this process, we can use several technologies including barcode, OCR, and other unique technologies created by LEADTOOLS. Any form that cannot be recognized is flagged and manually checked at a later time.
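The idea of matching a filled form against a collection of master forms can be sketched as follows. This is a deliberately simple stand-in, not the LEADTOOLS recognition engine. It assumes OCR has already produced a list of words for the page and that each master form is described by a set of characteristic keywords. The names recognize_form, master_forms, and min_score are hypothetical.

```python
def recognize_form(page_words, master_forms, min_score=0.5):
    """Pick the master form whose keywords are best covered by the page.

    page_words:   words found on the filled form (for example, OCR output).
    master_forms: {form_name: iterable of keywords printed on that form}.
    Returns (form_name, score), or (None, best_score) when no master form
    reaches min_score, so the caller can flag the page for manual review.
    """
    page = {word.lower() for word in page_words}
    best_name, best_score = None, 0.0
    for name, keywords in master_forms.items():
        keys = {keyword.lower() for keyword in keywords}
        if not keys:
            continue  # A master form with no keywords cannot be matched.
        # Score = fraction of the master form's keywords present on the page.
        score = len(page & keys) / len(keys)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_score:
        return None, best_score
    return best_name, best_score
```

Scoring by keyword coverage rather than overall word overlap keeps customer-entered text from lowering the score. A production system would typically combine several signals (barcodes, layout, OCR) instead of relying on keywords alone.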

Processing – Now that we know what type of form we have, we know what information should be extracted and where it is located on the form. We now extract all relevant information, including barcode data and customer-filled data (name, address, date, social security number, signature, logos, etc.). Several technologies including OCR, ICR, OMR, and others are used to extract the data. Any form that cannot be processed is flagged and manually checked at a later time.
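Because the form type tells us where each field is located, extraction can be thought of as reading the text that falls inside predefined zones. The sketch below shows that idea only and does not use any LEADTOOLS call. It assumes a recognition engine has already returned each word with the coordinates of its top-left corner. The names extract_fields and field_zones are hypothetical.

```python
def extract_fields(words, field_zones):
    """Collect the recognized text that falls inside each field's zone.

    words:       list of (text, x, y) tuples, one per recognized word,
                 where (x, y) is the word's top-left corner.
    field_zones: {field_name: (left, top, right, bottom)} for the form type.
    Returns (fields, missing): the extracted text per field, and the names
    of fields that came back empty so the form can be flagged for review.
    """
    fields = {}
    for name, (left, top, right, bottom) in field_zones.items():
        inside = [
            (y, x, text)
            for text, x, y in words
            if left <= x < right and top <= y < bottom
        ]
        inside.sort()  # Reading order: top to bottom, then left to right.
        fields[name] = " ".join(text for _, _, text in inside)
    missing = [name for name, value in fields.items() if not value]
    return fields, missing
```

For example, a zone defined around the name line picks up the handwritten name but not the printed "Name:" label to its left, and an empty signature zone shows up in the list of missing fields.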

Quality Assurance – In some cases, a form cannot be recognized or processed. This can happen for several reasons: the scan or fax is of low quality, the form has not been added to the master collection, the document is incomplete, or it was poorly filled in. A quality assurance agent manually inspects these files and decides whether they can be recognized and processed or need to be created again.

Output – The output phase takes the extracted data and acts on it: storing and archiving it, emailing the results, generating reports, launching other processes, etc. The original document is also stored in the most efficient format possible, such as LEADTOOLS ABC. Where standard formats are desired, you can also use TIFF, PDF, JPEG, and many more. LEADTOOLS currently supports over 140 different formats.
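As one possible shape for the storage part of the output phase, the following sketch writes a processed form's extracted fields to a database using Python's standard sqlite3 module. The table name processed_forms, its columns, and the store_result function are assumptions made for this example. They are not part of LEADTOOLS.

```python
import json
import sqlite3


def store_result(conn, form_type, fields, source_path):
    """Save one processed form and return the id of the new row.

    conn:        an open sqlite3 connection.
    form_type:   name of the recognized master form.
    fields:      {field_name: extracted text}.
    source_path: where the archived original image is kept.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_forms ("
        "id INTEGER PRIMARY KEY, "
        "form_type TEXT NOT NULL, "
        "fields_json TEXT NOT NULL, "
        "source_path TEXT NOT NULL)"
    )
    cursor = conn.execute(
        "INSERT INTO processed_forms (form_type, fields_json, source_path) "
        "VALUES (?, ?, ?)",
        (form_type, json.dumps(fields, sort_keys=True), source_path),
    )
    conn.commit()
    return cursor.lastrowid
```

Storing the fields as JSON keeps one table usable for every form type. A schema with one column per field is the usual alternative when reports need to query individual fields directly.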