GdPicturePDF GetPageText()

gtoledo
Posts: 46
Joined: Thu May 28, 2009 7:30 pm

Post by gtoledo » Wed Aug 15, 2018 12:49 am

Hello,
Can someone help me solve the following problem? I have several PDF files from which I need to extract the text. I'm using GdPicturePDF.GetPageText(), but I cannot get the full text of the document, only some areas.

Here is the code; the PDF file is attached.

Regards

Code:

private string GetTextPdfFile()
{
    GdPicturePDF oGdPicturePDF = new GdPicturePDF();

    string filePDF = @"C:\Temp\Endoso Niv-Pol 000000001 CIRUGIA DE NARIZ Y_O SENOS PARANASALES.pdf";
    string pageText = string.Empty; // '' is not a valid C# string literal

    if (oGdPicturePDF.LoadFromFile(filePDF, false) == GdPictureStatus.OK)
    {
        // Only page 1 is selected, so only its text is returned.
        oGdPicturePDF.SelectPage(1);
        pageText = oGdPicturePDF.GetPageText();
    }
    oGdPicturePDF.CloseDocument();

    return pageText;
}
Attachments
Endoso Niv-Pol 000000001 CIRUGIA DE NARIZ Y_O SENOS PARANASALES.pdf
(1.38 MiB) Downloaded 274 times

Gabriela
Posts: 436
Joined: Wed Nov 22, 2017 9:52 am

Re: GdPicturePDF GetPageText()

Post by Gabriela » Tue Sep 18, 2018 1:57 pm

Hello,

It is not quite clear what you mean by "but I could not get the full text of the document, only some areas."
With our latest release, all text on the page is extracted properly. You can find an example here:
https://www.gdpicture.com/guides/gdpicture/web ... eText.html
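
One thing worth checking: the posted method calls SelectPage(1) only, so it can return at most the text of the first page. If "the full text of the document" means every page, a loop over the page count may be what is needed. The following is a minimal sketch, not a confirmed fix for the attached file. It assumes the GdPicture14 namespace (adjust to the installed version), and the class and method names PdfTextSketch and GetAllPagesText are hypothetical helpers, not part of GdPicture.

```csharp
using System.Text;
using GdPicture14; // assumption: namespace of the installed GdPicture.NET release

static class PdfTextSketch
{
    // Hypothetical helper: concatenates the extracted text of every page.
    public static string GetAllPagesText(string filePdf)
    {
        var text = new StringBuilder();
        var pdf = new GdPicturePDF();
        try
        {
            if (pdf.LoadFromFile(filePdf, false) != GdPictureStatus.OK)
            {
                return string.Empty; // document could not be opened
            }

            int pageCount = pdf.GetPageCount();
            for (int page = 1; page <= pageCount; page++) // pages are 1-based
            {
                if (pdf.SelectPage(page) != GdPictureStatus.OK)
                {
                    continue; // skip pages that cannot be selected
                }

                string pageText = pdf.GetPageText();
                // GetStat() reports the status of the last call on this object.
                if (pdf.GetStat() == GdPictureStatus.OK)
                {
                    text.AppendLine(pageText);
                }
            }
        }
        finally
        {
            pdf.CloseDocument();
        }
        return text.ToString();
    }
}
```

If parts of a page are still missing after that, those areas may be images rather than text, in which case GetPageText() has nothing to return for them.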
