Split PDF at given pages based on text search

Example requests & Code samples for GdPicture Toolkits.
cwiernik
Posts: 5
Joined: Wed Aug 15, 2012 5:00 pm

Post by cwiernik » Wed Aug 15, 2012 8:05 pm

I would like to make a PDF Content Splitter program.

It would search for a specific text string (tag) as defined. For example:

<<Print>>
<<Email:myemail@email.com>>
<<Fax:1999999999>>

The program would do the following:

1. Search for the << code, then search for the next >>, and extract the text between them.
2. Split the PDF into separate PDFs at the << >> locations, so that each output PDF runs from one << >> location up to the next.
3. The saved PDF will have the << >> markers and the text between them removed.
4. The filename will be based partly on the text within the << >> and partly on the original filename.
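
As a rough illustration of steps 1 and 4, here is a minimal Python sketch of the tag parsing and filename logic. It works on plain text that has already been extracted from a page and does not call any PDF library. The function names (`parse_tag`, `find_tags`, `output_name`) and the filename pattern are assumptions made for this example, not part of GdPicture or any other product. Removing the tag from the saved PDF (step 3) depends on the PDF library and is not shown.

```python
import re
from pathlib import Path

# Matches one <<...>> tag; the non-greedy body stops at the first >>.
TAG_RE = re.compile(r"<<(.*?)>>")


def parse_tag(body):
    """Split 'Email:a@b.com' into ('Email', 'a@b.com'); 'Print' into ('Print', '')."""
    action, _, value = body.strip().partition(":")
    return action.strip(), value.strip()


def find_tags(text):
    """Return the parsed tags found in a block of page text, in order."""
    return [parse_tag(m.group(1)) for m in TAG_RE.finditer(text)]


def output_name(original, action, value, index):
    """Build an output filename from the original name and the tag contents.

    The pattern <stem>_<index>_<action>[_<value>].pdf is hypothetical.
    """
    stem = Path(original).stem
    # Replace runs of characters that are risky in filenames with "_".
    safe = re.sub(r"[^A-Za-z0-9@._-]+", "_", value).strip("_")
    parts = [stem, f"{index:03d}", action] + ([safe] if safe else [])
    return "_".join(parts) + ".pdf"


if __name__ == "__main__":
    text = "<<Email:myemail@email.com>> Statement for customer 42"
    for action, value in find_tags(text):
        print(action, value, output_name("report.pdf", action, value, 1))
```

The action part of the tag (Print, Email, Fax) can then drive the existing print/fax/email code, with the value part as its argument.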

This is similar to some programs out there like A-PDF Content Splitter and some of the functionality of the PDF--Explode product.

We need to take a specially coded PDF, locate the tags, split/burst the PDF into separate PDFs, and then print/fax/email the separate PDFs, for which we already have code. We currently generate one complete PDF and then create the separate PDFs programmatically, but this has to be coded specially for each report produced by the report generator. Automatically splitting the report based on a text search would be a lot easier.

Cliff

Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Split PDF at given pages based on text search

Post by Loïc » Mon Sep 10, 2012 7:39 pm

Hello Cliff,

Sorry for the delay.
Currently, the only way is to use hidden text. This may not be suitable if you later run text extraction or want to modify the text content.

The upcoming 9.2 release, which will be published on Wednesday, will include 3 new methods to simplify this kind of task:
- SetPagePrivateTag()
- GetPagePrivateTag()
- DeletePagePrivateTag()

I suggest waiting until the release is available and then trying these functions. Of course, we can provide some guidance if you have trouble implementing what you need.

With best regards,

Loïc

Re: Split PDF at given pages based on text search

Post by cwiernik » Wed Sep 12, 2012 2:48 pm

I will look at this. Making the text hidden (white foreground) is not a problem, and then it does not need to be removed. The << xxxx >> tag can be at the top of each page. I just need to be able to search for it and split the pages into separate PDFs, and I also need to know the contents of the tag so I can process each resulting PDF file properly.

Since I don't program directly in VB, an example that I can convert to Xbase++ would be helpful.

It only needs to search for a tag, split at that page, save the tag value to a variable, and display the tag.

Cliff.

Re: Split PDF at given pages based on text search

Post by cwiernik » Thu Sep 13, 2012 9:15 pm

I looked at these methods. They appear to be meant for the case where you write a private tag into the file yourself and then read the data back. However, the PDFs I would be searching are generated by a different PDF generator, a report writer. I am not sure how these functions would be used in this case.

Re: Split PDF at given pages based on text search

Post by Loïc » Fri Sep 14, 2012 11:33 am

cwiernik wrote: "I would be searching for would be generated by a different PDF generator, a report writer"

OK, so I cannot really provide a snippet, since you are not using GdPicture to generate the file.

The basic steps are:

1- With your generator, find a way to draw a text string.
2- From GdPicture, extract the page text and search for the text string.
3- Do your processing according to the result of step 2.
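
Steps 2 and 3 could be sketched roughly as follows, in Python for illustration only. The sketch assumes the text of each page has already been extracted into a list of strings by whatever PDF library is in use; no GdPicture calls are shown, and `split_ranges` is a hypothetical helper name. It also assumes splitting happens at page boundaries and that only the first tag on a page matters.

```python
import re

# Matches one <<...>> tag; the non-greedy body stops at the first >>.
TAG_RE = re.compile(r"<<(.*?)>>")


def split_ranges(page_texts):
    """Group pages into (tag, first_page, last_page) segments, 1-based and inclusive.

    A page containing a tag starts a new segment; pages without a tag
    extend the current segment. Pages before the first tag are skipped.
    """
    segments = []
    for page_no, text in enumerate(page_texts, start=1):
        match = TAG_RE.search(text or "")
        if match:
            segments.append([match.group(1).strip(), page_no, page_no])
        elif segments:
            segments[-1][2] = page_no  # extend the open segment
    return [tuple(s) for s in segments]


if __name__ == "__main__":
    # Stand-in for text extracted page by page from the report PDF.
    pages = [
        "<<Print>> Invoice 1",
        "Invoice 1, continued",
        "<<Email:myemail@email.com>> Invoice 2",
        "<<Fax:1999999999>> Invoice 3",
    ]
    for tag, first, last in split_ranges(pages):
        print(f"tag={tag!r}: pages {first}-{last}")
```

Each resulting page range can then be copied into its own PDF with the library's page-copy or clone functions, and the tag value kept in a variable to decide how that file is processed.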

I can not really suggest more.

With best regards,

Loïc

Re: Split PDF at given pages based on text search

Post by cwiernik » Fri Sep 14, 2012 8:43 pm

Thanks. I think that should be able to do it.
