Sometimes specific backgrounds can interfere with an OCR operation. This code tip shows how to use a double-pass of image preprocessing functionality to preserve text while removing a bothersome background. Consider the following image:

Note the "LEADTOOLS" text at the top of the picture. This image is unsuitable for OCR, not because the text is close in color to what is behind it, but because of the magnitude of the fade from light to dark. The darkness at the bottom of the image skews the binarization threshold so far that the text is considered just as light as its surrounding background, so it would never survive the default processing required by the OCR engine.
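To make the threshold problem concrete, here is a minimal sketch that does not use LEADTOOLS at all. It assumes a synthetic 100x100 grayscale image that fades from 220 at the top to 0 at the bottom, with a short run of "text" pixels at 255 near the top, and it uses the global mean as a stand-in for the binarization threshold. The OCR engine's real binarization is not described here, so the class name, method names, and all of the numbers are illustrative assumptions only.

```csharp
using System;

public static class ThresholdSkewDemo
{
    // Builds a synthetic grayscale image with a vertical linear fade
    // from `top` (first row) to `bottom` (last row). Hypothetical helper.
    public static byte[,] BuildGradient(int width, int height, byte top, byte bottom)
    {
        var pixels = new byte[height, width];
        for (int y = 0; y < height; y++)
        {
            byte value = (byte)(top + (bottom - top) * y / (height - 1));
            for (int x = 0; x < width; x++)
            {
                pixels[y, x] = value;
            }
        }
        return pixels;
    }

    // Stand-in for a global binarization threshold: the mean of all pixels.
    // This is an assumption, not the OCR engine's actual algorithm.
    public static double GlobalMeanThreshold(byte[,] pixels)
    {
        double sum = 0;
        foreach (byte p in pixels)
        {
            sum += p;
        }
        return sum / pixels.Length;
    }

    public static void Main()
    {
        byte[,] image = BuildGradient(100, 100, 220, 0);

        // Paint a short run of bright "text" pixels near the top of the fade.
        for (int x = 40; x < 60; x++)
        {
            image[5, x] = 255;
        }

        double threshold = GlobalMeanThreshold(image);
        bool textIsLight = image[5, 50] > threshold;
        bool backgroundIsLight = image[5, 0] > threshold;

        Console.WriteLine("Text above threshold: " + textIsLight);
        Console.WriteLine("Background above threshold: " + backgroundIsLight);
    }
}
```

Both flags come out true: the dark lower half pulls the threshold down to roughly the middle of the range, so the text and the light background right behind it land on the same side of it and the text disappears after binarization.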

However, the text can be preserved by selectively applying the ContrastBrightnessIntensityCommand in two steps. Here's our documentation on the command:

The first pass diminishes the background while preserving the text. It sets the contrast and brightness to their minimum and the intensity to its maximum, which flattens the contrast and dims the image while making the bright pixels brighter.

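// Requires: using Leadtools; using Leadtools.Codecs; using Leadtools.ImageProcessing.Color;
RasterCodecs codecs = new RasterCodecs();
RasterImage image = codecs.Load(@"grad1.png");
// First pass: minimum contrast and brightness, maximum intensity
ContrastBrightnessIntensityCommand cbic1 = new ContrastBrightnessIntensityCommand(-1000, -1000, 1000);
cbic1.Run(image);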

Here's the output produced.

This is still insufficient for OCR. However, running the command again on the modified image, with adjusted parameters, reduces the background further.

ContrastBrightnessIntensityCommand cbic2 = new ContrastBrightnessIntensityCommand(1000, -400, 1000);
cbic2.Run(image);

This increases the contrast between the text and the background behind it, dims the image further, and keeps the intensity at its maximum to preserve the bright areas. Here are the results.

At this point the text is, relatively, the brightest part of the image. Here's the final processed image, which is passed to the OCR engine.

This procedure can be modified based on the content of the image and the preprocessing necessary.

Edited by moderator Thursday, June 14, 2018 9:09:02 AM(UTC)

Nick Crook
Developer Support Engineer
LEAD Technologies, Inc.

