Sample Program doesn't produce searchable PDF file

Discussions about machine vision support in GdPicture.
luke92
Posts: 6
Joined: Tue Apr 05, 2011 2:45 pm

Post by luke92 » Tue Apr 05, 2011 2:51 pm

I tried to convert a PDF file to a searchable (OCR) PDF file using the GdPicture.NET sample "PDF to PDF-OCR".
I was able to produce a file, but its text isn't searchable. Do I have to modify the sample program to make it work?

Thanks for your help.

Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Post by Loïc » Tue Apr 05, 2011 2:58 pm

Hi,

Please send the resulting PDF to https://www.gdpicture.com/support/getting-support-from-our-team for investigation.

If you provided the correct dictionary path, the program should work.

Kind regards,

Loïc

Post by luke92 » Tue Apr 05, 2011 3:16 pm

I used the standard dictionary path C:\Programme\GdPicture.NET\Redist\Commons\OCR

Post by Loïc » Tue Apr 05, 2011 3:17 pm

OK. Please send the resulting PDF so that we can investigate.

ryancole11
Posts: 21
Joined: Fri May 21, 2010 7:19 pm

Post by ryancole11 » Wed May 04, 2011 9:15 pm

Can you please keep me informed about this? I am also trying to use that example code to turn a non-searchable PDF into an OCR'd, searchable PDF, and it is not producing one. The output is a PDF/A with no embedded text. I know that it is at least performing the OCR operations with the dictionary files, because each page takes a couple of seconds to process. An example PDF should not be needed, because this fails for every PDF that I test it with.

I am using C# and the .NET version of GdPicture Pro and Tesseract. Here's my code:

http://dpaste.org/uLWu/

Code:

String dictionaries = Path.Combine(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), "dictionaries");

// open the new pdf in the viewer
viewer.DisplayFromFile(out_file);

for (int x = 1; x <= viewer.PageCount; x++)
{
	Console.WriteLine("Performing image twain on page {0}", x);

	viewer.DisplayFrame(x);
	Int32 rasterized_page = viewer.GetNativeImage();

	if (x == 1)
		imaging.TwainPdfOCRStartEx(String.Format("{0}.ocr.pdf", out_file), "", "", "", "", "", PdfEncryption.PdfEncryptionNone, PdfRight.PdfRightCanModify);

	imaging.TwainAddGdPictureImageToPdfOCR(rasterized_page, TesseractDictionary.TesseractDictionaryEnglish, dictionaries);
}

// finish the OCR'd PDF output and close the source document
imaging.TwainPdfOCRStop();
viewer.CloseImage();

Post by Loïc » Thu May 05, 2011 6:24 pm

Hi,

Please send a standalone application that reproduces the issue, along with the input and output PDFs, to https://www.gdpicture.com/support/getting-support-from-our-team

Kind regards,

Loïc

Post by ryancole11 » Thu May 05, 2011 6:25 pm

Alright, give me about 30 minutes. I'm in the middle of something at the moment.
