#1 Posted : Tuesday, September 12, 2006 7:00:23 AM(UTC)
pdiermen

Groups: Registered
Posts: 3


Is it possible with LEADTOOLS to take a PDF that contains a searchable image and extract its different parts (image, text, and hit-highlight information)?

Thanks in advance.

Best regards,
DEVENTit BV
Peter van Diermen

 

 


#2 Posted : Wednesday, September 13, 2006 11:18:39 PM(UTC)
Maen Hasan

Groups: Registered, Tech Support
Posts: 1,326

Was thanked: 1 time(s) in 1 post(s)

Hello,

Do you mean that you need to get information about the different parts of the PDF document (image, text, etc.)?
If yes, please provide me with the following information:
- Which exact LEADTOOLS version do you use?
- Which programming interface (COM, API, .NET) do you use?
- Please provide more details about what you are trying to do.

Thanks,
Maen Badwan
LEADTOOLS Technical Support
 
#3 Posted : Thursday, September 14, 2006 1:47:42 AM(UTC)
pdiermen

Groups: Registered
Posts: 3


Maen,

We are using LEADTOOLS v14 for .NET. Given a searchable-image PDF, we want to decompose it and extract the following:

  • the image that is contained in the PDF
  • the text information which is available for searching in the PDF
  • and the positioning information of every word which is used in the PDF to highlight the terms searched for (on the image).
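To make the third item concrete, here is a minimal sketch of how per-word position information can drive hit highlighting once it has been extracted. It is plain Python and does not use any LEADTOOLS API; the `Word` record, the `find_highlights` helper, and the sample coordinates are all hypothetical and only illustrate the kind of data we are after.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """Hypothetical record: one word of the hidden text layer and its box in image pixels."""
    text: str
    x: int
    y: int
    width: int
    height: int


def find_highlights(words: List[Word], term: str) -> List[Tuple[int, int, int, int]]:
    """Return the (x, y, width, height) box of every word that matches the search term.

    Matching ignores case and surrounding punctuation.
    """
    needle = term.casefold()
    return [
        (w.x, w.y, w.width, w.height)
        for w in words
        if w.text.strip(".,;:!?").casefold() == needle
    ]


# Made-up sample data standing in for the words of one page.
page_words = [
    Word("Invoice", 40, 32, 118, 22),
    Word("total:", 170, 32, 80, 22),
    Word("invoice", 40, 300, 110, 20),
]

# Each returned rectangle would be drawn over the page image as a highlight.
for box in find_highlights(page_words, "invoice"):
    print(box)
```

A viewer would paint a translucent rectangle over the image for each returned box, which is why we need the word positions in addition to the text itself.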

Thanks in advance.

 
#4 Posted : Sunday, September 17, 2006 8:25:59 PM(UTC)
Maen Hasan

Groups: Registered, Tech Support
Posts: 1,326

Was thanked: 1 time(s) in 1 post(s)

Hello,

We don't have functions that load the contents of a PDF file as separate text, drawing, and image objects. What we have is the Raster PDF Plug-in, which lets you load a full page as a raster image. Any image, text, or drawing objects on the page are merged into a single image when the page is loaded.

If you have our OCR module (part of the LEADTOOLS Document Imaging Suite), you can run OCR on the resulting image to obtain text information. However, this does not guarantee that you will get exactly the same text as in the PDF, because several factors and special cases are involved. For example, some PDF files contain 'hidden' text that can be searched in Acrobat but does not appear in the rasterized image loaded by LEADTOOLS.

Thanks,
Maen Badwan
LEADTOOLS Technical Support
 
#5 Posted : Monday, September 18, 2006 2:41:36 AM(UTC)
pdiermen

Groups: Registered
Posts: 3


Maen, thanks for this information.

 