Tonal range is an important attribute of an image, especially a photograph, when trying to improve OCR results. Tonal range is the span of tones between the lightest and darkest areas of an image and is also known as contrast. An image with a wide range has both very dark (black) and very light (white) elements. An image with a narrow range is more limited in its tonal scope, usually sitting in the mid-range (varying shades of gray).
An image with a tonal range greater than about forty percent can cause inaccurate recognition during OCR. The percentage is calculated by taking the PixelCount and dividing it by the TotalPixelCount. The higher the percentage, the darker the pixels. If the image’s text is black or another dark color and the percentage is high, the text can blend in with the rest of the image’s pixels. In this case, the OCR engine will have a hard time separating the text from the background.
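To make the arithmetic concrete, here is a minimal sketch in plain C# that does not use LEADTOOLS at all. It assumes an 8-bit grayscale image stored as a byte array. The class name TonalRangeSketch, the method PercentInRange, and the choice of 0 to 127 as the "dark" band are all hypothetical and only illustrate the PixelCount / TotalPixelCount ratio described above.

```csharp
using System;

public static class TonalRangeSketch
{
    // Hypothetical helper: percentage of pixels whose 8-bit intensity
    // falls inside [start, end], i.e. PixelCount / TotalPixelCount * 100.
    public static double PercentInRange(byte[] grayPixels, byte start, byte end)
    {
        if (grayPixels == null || grayPixels.Length == 0)
            throw new ArgumentException("Image must contain at least one pixel.", nameof(grayPixels));

        // PixelCount: how many pixels fall inside the intensity band.
        int pixelCount = 0;
        foreach (byte value in grayPixels)
        {
            if (value >= start && value <= end)
                pixelCount++;
        }

        // TotalPixelCount is simply the length of the array.
        return 100.0 * pixelCount / grayPixels.Length;
    }
}
```

For example, with the five pixel values 10, 20, 30, 200, and 250 and an assumed dark band of 0 to 127, three of the five pixels fall in the band, so the result is 60 percent, which is above the forty percent mark mentioned above.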
Among the many image processing functions in the LEADTOOLS Image Processing SDK is the StatisticsInformationCommand class, which returns statistical information about an image and its pixels. This class has a Percent property that gets the percent value of the tonal range found in the image. As mentioned before, you want this percentage to be low when extracting text from documents.
Images that have a high percentage can be easily fixed with two lines of code and two image processing commands:
StretchIntensityCommand - Increases the contrast in an image by centering, maximizing, and proportioning the range of intensity values.
ChangeContrastCommand - Increases or decreases the contrast of the image. Valid values are -1000 to +1000.
using Leadtools;
using Leadtools.ImageProcessing.Color;

public static void DoCleanUp(RasterImage image)
{
    // Run stretch intensity to make the darkest color black
    new StretchIntensityCommand().Run(image);
    // Then increase the contrast to its maximum (+1000) to decrease the midtones
    new ChangeContrastCommand() { Contrast = 1000 }.Run(image);
}
Checking the tonal range is beneficial when performing batch OCR processes because it is a fast and automated way to give the OCR engine the best chance to produce great results.
In the “Before and After” GIF below, I demonstrate the difference between an image with a high percentage and an image with a low percentage.
A C# project, written by a Developer Support Engineer, can be downloaded from the LEADTOOLS Forums.