Help Version 16.5.9.25
Forms Recognition and Processing Workflow

Figure 1. ECM (Enterprise Content Management)

One of the key features of any successful Enterprise Content Management (ECM) system, and of Document Imaging applications in particular, is Forms Recognition and Processing. While ECM systems generally handle everything from capture and management to storage and delivery of documents, Forms Recognition and Processing is a crucial element that can optimize the entire workflow.

Forms Recognition is the process of taking a filled-out form and automatically determining which type of form it is. Forms Processing is the process of automatically extracting key information from a filled-out form (name, address, date, Social Security number, etc.). Automating both steps replaces work that was previously done manually, form by form. Businesses become more efficient because they can process more forms in a given period, which saves money.

In most cases, a typical workflow begins with the creation of the actual form to be processed and ends with the extracted data being stored in a database for later retrieval, report generation, etc.

Form Creation – This is where the actual form is created and all relevant information and fields are added to the form.

Distribution – The forms are distributed to the users to be filled in. Documents can be distributed electronically, or on paper.

Input/Capture – Documents can be captured in a variety of ways (see below). Once the document is captured, it is sent to the Document Management System for processing.

Sources for Document Capture
  • Scanned/Faxed documents.
  • Imported documents that were filled out electronically. These could be filled PDFs, Word documents, spreadsheets, etc.
  • Existing electronic documents.

Image Cleanup – To maximize recognition and processing results, the document needs to be as clean as possible. LEADTOOLS provides an extensive set of image processing methods for fixing common problems in scanned or faxed images, such as line removal, hole punch removal, dot removal, etc.
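To make the idea of dot removal concrete, here is a minimal sketch in plain Python. It is not the LEADTOOLS implementation and does not use any LEADTOOLS API; the function name `remove_isolated_dots` and the list-of-lists image representation are assumptions chosen for illustration. It clears any black pixel that has no black pixel among its eight neighbors.

```python
from typing import List

# Hypothetical image representation: 1 = black pixel, 0 = white pixel.
Image = List[List[int]]


def remove_isolated_dots(image: Image) -> Image:
    """Return a copy of `image` with single-pixel specks cleared."""
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]  # leave the input untouched
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            # Look at the (up to) eight surrounding pixels.
            has_neighbor = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbor:
                cleaned[y][x] = 0
    return cleaned


noisy = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # lone speck
    [0, 0, 0, 1, 1],  # part of a pen stroke
    [0, 0, 0, 1, 0],
]
cleaned = remove_isolated_dots(noisy)
```

A production cleanup step would normally work on connected components of a configurable size rather than single pixels; the sketch only shows the principle.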

Recognition – Once the image is clean, it is ready for recognition. At this point, recognition is attempted to determine which type of form the image contains. Several technologies can be used for this process, including barcode, OCR, and other unique technologies created by LEADTOOLS. Any form that cannot be recognized is flagged so that it can be checked manually at a later time.
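The recognition step can be pictured as scoring a page against each known form and flagging the page when no score is high enough. The sketch below is only an illustration under stated assumptions: it matches on OCR'd words alone, and the `MASTER_FORMS` collection, the form names, the `recognize_form` function, and the 0.75 threshold are all hypothetical, not part of LEADTOOLS.

```python
from typing import Dict, Optional, Set

# Hypothetical master collection: form type -> words expected on that form.
MASTER_FORMS: Dict[str, Set[str]] = {
    "invoice": {"invoice", "bill", "total", "due"},
    "claim": {"insurance", "claim", "policy", "incident"},
}


def recognize_form(page_words: Set[str], min_score: float = 0.75) -> Optional[str]:
    """Return the best-matching form type, or None to flag the page for review."""
    best_name: Optional[str] = None
    best_score = 0.0
    for name, expected in MASTER_FORMS.items():
        # Fraction of the form's expected words that were found on the page.
        score = len(expected & page_words) / len(expected)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_score else None


recognized = recognize_form({"insurance", "claim", "policy", "number", "incident"})
unrecognized = recognize_form({"hello", "policy"})
```

Here `recognized` is `"claim"`, while `unrecognized` is `None` because only one of the four expected words was found, so that page would be flagged for a manual check.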

Processing – Once the form has been recognized as a specific type of form, we know what information needs to be extracted and where it is located on the form. All relevant information can be extracted, including barcode data and customer-filled data (name, address, date, Social Security number, signature, etc.) as well as logos. Several technologies, including OCR, ICR, OMR, and others, are used to extract the data. Any form that cannot be processed is flagged so that it can be checked manually at a later time.
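Because the form type tells us where each field is located, extraction can be thought of as collecting the recognized text that falls inside each field's rectangle. The following is a rough sketch of that idea, not the LEADTOOLS API: the `Word` record, the zone tuples, the coordinates, and `extract_fields` are all hypothetical, and the words are assumed to come from an earlier OCR or ICR pass.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Word:
    """A recognized word and the position of its top-left corner, in pixels."""
    text: str
    x: int
    y: int


Zone = Tuple[int, int, int, int]  # left, top, right, bottom


def extract_fields(words: List[Word], zones: Dict[str, Zone]) -> Dict[str, str]:
    """Join the words that start inside each zone, in reading order."""
    result: Dict[str, str] = {}
    for name, (left, top, right, bottom) in zones.items():
        inside = [w for w in words if left <= w.x < right and top <= w.y < bottom]
        inside.sort(key=lambda w: (w.y, w.x))  # top-to-bottom, then left-to-right
        result[name] = " ".join(w.text for w in inside)
    return result


# Hypothetical field layout for one form type.
zones = {
    "name": (100, 50, 400, 80),
    "date": (100, 100, 250, 130),
    "signature": (100, 300, 400, 360),
}
words = [
    Word("Name:", 20, 60),  # printed label, outside every zone
    Word("Doe", 180, 60),
    Word("Jane", 110, 60),
    Word("2024-01-15", 105, 110),
]
fields = extract_fields(words, zones)
empty_fields = [name for name, value in fields.items() if not value]
```

In this example the `signature` zone comes back empty, which is the kind of result that would cause the form to be flagged.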

Quality Assurance – In some cases, a form cannot be recognized or processed. This can occur under several conditions: the scan/fax is low quality, the form has not been added to the master collection, the document is incomplete, the document was filled out poorly, etc. A quality assurance agent manually inspects these files and decides whether they should be recognized and processed or need to be recreated.
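One way to decide which forms reach the quality assurance agent is to collect the reasons a form failed and route it to a review queue whenever that list is not empty. This is a sketch under assumptions: `FormResult`, `review_reasons`, and the required field names are hypothetical and only mirror the failure conditions described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FormResult:
    document_id: str
    form_type: Optional[str]  # None when recognition failed
    fields: Dict[str, str] = field(default_factory=dict)


def review_reasons(result: FormResult, required: List[str]) -> List[str]:
    """Return the reasons a form needs manual review; an empty list means none."""
    if result.form_type is None:
        return ["form type not recognized"]
    return [
        f"missing field: {name}"
        for name in required
        if not result.fields.get(name)
    ]


required = ["name", "date"]
results = [
    FormResult("doc-001", "claim", {"name": "Jane Doe", "date": "2024-01-15"}),
    FormResult("doc-002", "claim", {"name": "John Roe", "date": ""}),
    FormResult("doc-003", None),
]
# Only documents with at least one reason go to the review queue.
qa_queue = {
    r.document_id: reasons
    for r in results
    if (reasons := review_reasons(r, required))
}
```

With this input, `doc-001` goes straight to output, while `doc-002` and `doc-003` land in `qa_queue` with their reasons attached.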

Output – Now the form is ready for the output phase, which generally takes the extracted data and acts on it: storing and archiving it, emailing the results, generating reports, launching other processes, etc. The original document can also be stored in the most efficient format possible, such as LEADTOOLS ABC. If standard formats are desired, the forms can be stored as TIFF, PDF, JPEG, and many more. LEADTOOLS currently supports over 140 different formats.
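As an illustration of the storage option, the sketch below writes extracted fields to a database using Python's standard `sqlite3` module. The table name, column names, and `store_fields` function are assumptions made for the example; a real deployment would use whatever schema and database the surrounding ECM system requires.

```python
import sqlite3
from typing import Dict


def store_fields(
    conn: sqlite3.Connection,
    document_id: str,
    form_type: str,
    fields: Dict[str, str],
) -> None:
    """Store one row per extracted field (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS extracted_fields ("
        "document_id TEXT, form_type TEXT, field_name TEXT, field_value TEXT)"
    )
    conn.executemany(
        "INSERT INTO extracted_fields VALUES (?, ?, ?, ?)",
        [(document_id, form_type, name, value) for name, value in fields.items()],
    )
    conn.commit()


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
store_fields(conn, "doc-001", "claim", {"name": "Jane Doe", "date": "2024-01-15"})

# Later retrieval, for example for report generation.
rows = conn.execute(
    "SELECT field_name, field_value FROM extracted_fields "
    "WHERE document_id = ? ORDER BY field_name",
    ("doc-001",),
).fetchall()
```

Storing one row per field keeps the table independent of any particular form layout, at the cost of needing a pivot when a report wants one row per document.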
