#1 Posted : Friday, May 2, 2014 5:04:50 AM(UTC)
Nihmathullh

Groups: Registered
Posts: 1


How can I make an image-only PDF searchable without changing its format?

Other details: Visual Studio 2010, .NET 3.5


#2 Posted : Friday, May 2, 2014 9:07:16 AM(UTC)

Hadi

Groups: Manager, Tech Support, Administrators
Posts: 214

Was thanked: 12 time(s) in 12 post(s)

You can make a PDF searchable by using OCR - Optical Character Recognition. You can find more information at the following link:
http://www.leadtools.com/help/leadtools/v18/dh/to/leadtools.topics~leadtools.topics.ocr.html

You can load any supported file format into the engine and perform OCR on it. You can then save the results to any of the supported document formats listed here:
http://www.leadtools.com/help/leadtools/v18/dh/ft/leadtools.forms.documentwriters~leadtools.forms.documentwriters.documentformat.html

Here is a C# code snippet showing how to load an image, recognize it, and save it out as a PDF:
// Requires: using Leadtools.Forms.Ocr; using Leadtools.Forms.DocumentWriters;
using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, true))
{
   // Start the engine using default parameters
   ocrEngine.Startup(null, null, null, null);
   // Create an OCR document
   using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
   {
      // Add a page to the document
      IOcrPage ocrPage = ocrDocument.Pages.AddPage(imageFileName, null);
      // Recognize the page
      // Note: Recognize can be called without calling AutoZone or manually adding zones. The engine will
      // check for zones and automatically auto-zone the page
      ocrPage.Recognize(null);
      // Save the document as PDF. pdfFileName is the output path; keep it
      // different from imageFileName so the input file is not overwritten
      ocrDocument.Save(pdfFileName, DocumentFormat.Pdf, null);
   }
   // Shut down the engine
   // Note: calling Dispose will also automatically shut down the engine if it has been started
   ocrEngine.Shutdown();
}
For more information, I recommend checking out our OCR tutorials, located here:
http://www.leadtools.com/help/leadtools/v18/dh/to/leadtools.topics.forms.ocr~fo.topics.ocrtutorials.html

and the Leadtools.Forms.Ocr namespace reference, located here:
http://www.leadtools.com/help/leadtools/v18/dh/fo/leadtools.forms.ocr~leadtools.forms.ocr_namespace.html
Hadi Chami
Developer Support Manager
LEAD Technologies, Inc.

 
#3 Posted : Thursday, May 8, 2014 2:45:59 AM(UTC)
Nihmathullh

Groups: Registered
Posts: 1


Thanks for your reply.

However, the code given here converts a loaded image to PDF.

In our scenario, we want to convert a scanned PDF to a searchable PDF.
 
#4 Posted : Thursday, May 8, 2014 4:50:46 AM(UTC)

Hadi

Groups: Manager, Tech Support, Administrators
Posts: 214

Was thanked: 12 time(s) in 12 post(s)

You can use this same code to load any image (PDF included) and save it out as any of the supported document formats (PDF included).
In the sample I sent you above, the imageFileName is the path to any image, so you can set it to be your scanned PDF.
You can find a demo to test this out at the following toolkit location: C:\LEADTOOLS 18\Shortcuts\.NET Class Libraries\.NET Framework\02 Document\03 OCR - ICR - OMR\01 Main Demo
You can use this demo to load any image (including PDF) into the viewer, then you can recognize and save it out as a searchable PDF.
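
The workflow described above (scanned PDF in, searchable PDF out) can be sketched as follows. This is a minimal, untested sketch and not an official sample. It assumes the LEADTOOLS v18 OCR assemblies and PDF codec are referenced and the toolkit is licensed. The class name, method name, and the two path parameters are hypothetical. Setting ImageOverText on the PDF options is an assumption about how to keep the scanned page image visible with the recognized text behind it, which seems closest to "without changing the format".

```csharp
using Leadtools.Forms.DocumentWriters;
using Leadtools.Forms.Ocr;

static class SearchablePdfSketch
{
   // scannedPdfPath and searchablePdfPath are hypothetical parameter names
   public static void Convert(string scannedPdfPath, string searchablePdfPath)
   {
      using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, true))
      {
         // Start the engine using default parameters, as in the snippet earlier in the thread
         ocrEngine.Startup(null, null, null, null);

         // Assumption: ImageOverText keeps the original page image on top of the recognized text
         PdfDocumentOptions pdfOptions =
            ocrEngine.DocumentWriterInstance.GetOptions(DocumentFormat.Pdf) as PdfDocumentOptions;
         pdfOptions.ImageOverText = true;
         ocrEngine.DocumentWriterInstance.SetOptions(DocumentFormat.Pdf, pdfOptions);

         using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
         {
            // Add every page of the scanned PDF (first page 1, last page -1 meaning "to the end")
            ocrDocument.Pages.AddPages(scannedPdfPath, 1, -1, null);
            // Recognize all pages; the engine auto-zones pages that have no zones
            ocrDocument.Pages.Recognize(null);
            // Write to a separate output path so the scanned input is not overwritten
            ocrDocument.Save(searchablePdfPath, DocumentFormat.Pdf, null);
         }
         // Disposing the engine at the end of the using block also shuts it down
      }
   }
}
```

Calling SearchablePdfSketch.Convert with the path of the scanned PDF and a new output path should produce a PDF that looks like the scan but contains searchable text. Verify the result against the Main Demo mentioned above before relying on it.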
Hadi Chami
Developer Support Manager
LEAD Technologies, Inc.
