Get Contact Info from Business Card with OCR: 25 Projects in 25 Days

[Figure: OCR Zones]
[Figure: OCR Results]

As part of the LEAD Technologies 25th anniversary, we are creating 25 projects in 25 days to celebrate LEAD’s depth of features and ease of use. Today’s project comes from Hadi.

If you’ve been keeping track, this is our 25th project and the last one in our #LEAD25 series. But don’t be sad: we post examples like these to our blog and forums on a regular basis, so keep in touch! If you missed any of them, take a look back at our series introduction, where we’ve kept a running list as each project was posted.

What it Does

This ASP.NET C# application recognizes zones of text from business cards using LEADTOOLS Version 19.

Features Used

Development Progress Journal

Hello, my name is Hadi. I am writing a sample application that lets users upload an image of a business card and then select zones to be recognized using OCR, so that the text can be uploaded as a contact.

I am using an ASP.NET application so that I can combine server-side C# code in the code-behind with HTML5 and JavaScript on the client side. On the client, I will use the LEADTOOLS JavaScript ImageViewer and Annotation SDK to display the zones the user wants to recognize from the business card. On the server, I will use the .NET Annotations and OCR SDK.

To start, I need to create the aspx Default page for the user interface. I want the user to be able to upload an image, view it, and then add or remove annotation rectangles that mark the desired zones.

On the server side, I want to take the uploaded file and use the LEADTOOLS OCR AutoZone method so that some premade zones are available for the user to manipulate on the front end. I also need to be sure that the uploaded file has a MIME type the browser can display and, if not, convert it to one that does.
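The MIME check described above can be illustrated with a small sketch. It is written in JavaScript purely to show the logic; the project itself performs this step in the C# code-behind. The list of browser-friendly types and the `needsConversion` helper name are assumptions made for illustration and are not taken from the project.

```javascript
// Assumed set of image MIME types that common browsers render natively.
const BROWSER_IMAGE_TYPES = new Set(["image/jpeg", "image/png", "image/gif"]);

// Hypothetical helper: returns true when an uploaded file should be
// converted to a browser-friendly format before it is displayed.
function needsConversion(mimeType) {
  // Normalize case and drop parameters such as "; charset=binary".
  const baseType = String(mimeType || "").split(";")[0].trim().toLowerCase();
  return !BROWSER_IMAGE_TYPES.has(baseType);
}
```

With this sketch, a TIFF scan of a business card would be flagged for conversion, while a PNG or JPEG upload would be displayed as is.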

The AutoZone method gave me the zones, and I used the LEADTOOLS Annotations to save an XML file that contains the zone bounds. This lets me easily display the zones to the user on the front end. Next, I need a JavaScript function that loads the image and the XML zone file for display. Because that function has to be called from the server, I used the ClientScriptManager.RegisterStartupScript method from the System.Web.UI namespace.

Now that I have the XML filename in the JavaScript, I need to load it with AnnCodecs from the LEADTOOLS.Annotations.Core.JS namespace. I am using the XMLHttpRequest open and send methods to open the XML file and process the results in the onreadystatechange event. To load the image, I just set the ImageViewer.ImageUrl property to the file of the image I passed to the function. It was very easy to load the image with the LEADTOOLS ImageViewer:

Documentation: ImageViewer
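The XMLHttpRequest step mentioned above could look roughly like the following sketch. It only covers fetching the XML text; handing that text to AnnCodecs is left to the callback, because the exact LEADTOOLS calls are not shown in this post. The function name `loadZoneXml`, its parameters, and the optional constructor argument (added so the function can be exercised outside a browser) are assumptions.

```javascript
// Hypothetical helper: requests the zone XML file and passes its text
// to onLoaded once the request completes successfully.
function loadZoneXml(url, onLoaded, onError, XhrCtor) {
  // Fall back to the browser's XMLHttpRequest when no constructor is given.
  const Xhr = XhrCtor || XMLHttpRequest;
  const request = new Xhr();

  request.onreadystatechange = function () {
    // readyState 4 means the request has finished.
    if (request.readyState !== 4) {
      return;
    }
    if (request.status === 200) {
      onLoaded(request.responseText);
    } else if (onError) {
      onError(new Error("Zone XML request failed with status " + request.status));
    }
  };

  request.open("GET", url, true); // true = asynchronous
  request.send();
}
```

In the page, the `onLoaded` callback would be the place to load the XML with AnnCodecs and add the resulting zones to the viewer.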

Now that most of the functionality is implemented, I am going to add features that let the user manipulate the zones. I want the user to be able to name each zone so that they know which field it represents. For this I am using the AnnObject.AnnLabel property:

Documentation: AnnLabel

A couple of other features I have added let the user delete selected zones or clear all of the zones. This is easily achieved using the AnnAutomation.DeleteObject and DeleteObjects methods:

Documentation: AnnAutomation

Now that the additional features are implemented, the final step is to recognize the new zones. Since the project already had to load the image and use the OcrEngine to AutoZone, I kept those in memory on the server so that everything is ready. I just need to pass the new XML data back to the server using PageMethods, then use the .NET AnnCodecs to load the new XML and get the zones out of it. Once I update the zones, I call the OcrDocument.Recognize method and then alert the user with the recognized text.
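The client side of that round trip might be sketched as follows. ASP.NET AJAX generates a `PageMethods` proxy whose functions take the method arguments followed by success and failure callbacks. The server method name `RecognizeZones` and the wrapper `sendZonesForRecognition` are hypothetical names chosen for this sketch, and the proxy is passed in as a parameter so the function can be tried outside the page.

```javascript
// Hypothetical wrapper around the ASP.NET AJAX PageMethods proxy.
// "RecognizeZones" is an assumed name for a static [WebMethod] on the page.
function sendZonesForRecognition(zoneXml, pageMethods, notify) {
  pageMethods.RecognizeZones(
    zoneXml,
    // Success callback: receives the value returned by the server method.
    function onSuccess(recognizedText) {
      notify(recognizedText);
    },
    // Failure callback: ASP.NET AJAX error objects expose get_message().
    function onFailure(error) {
      const message =
        error && typeof error.get_message === "function"
          ? error.get_message()
          : String(error);
      notify("Recognition failed: " + message);
    }
  );
}
```

In the page this could be called as `sendZonesForRecognition(xml, PageMethods, function (text) { alert(text); })`, which matches the alert-based result display described above.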

Overall, this app was fun to write because it taught me a lot about client and server interaction. Using the LEADTOOLS libraries made it much simpler, since any one aspect of it (Annotations, OCR, Viewer) would have been extremely hard to implement without them. If I had more time, I would look into exporting the recognized text into an actual Outlook or Google contact.

Download the Project

The source code for this sample project can be downloaded from here.

About 

LEAD Technologies, Inc.
