OCR Failing

Post by lbleicher » Tue May 22, 2012 5:48 am

Hi-

I have been using the code below to OCR PDF image files successfully for a while, but I recently had to change environments, and now I no longer get any text in the resulting PDF/A. Each page seems to be read and rendered, but the PdfAddGdPictureImageToPdfOCR statement does not seem to do anything.

Two questions:
Is there a way to tell if PdfAddGdPictureImageToPdfOCR is erroring (like a GetStat statement)?
What might cause PdfAddGdPictureImageToPdfOCR to fail silently? Would a missing dictionary, for example, cause an error?

Thanks,
Leo

Code:

        Dict = "eng"

        PdfID = oGdPictureImaging.PdfOCRStart(OutputFilePath, True, "", "", "", "", "DocDigester")
        oGdPictureImaging.OCRTesseractSetPassCount(2)

        If InputPDF.LoadFromFile(pdfPath, False) = GdPicture.GdPictureStatus.OK Then
            node.GetProperties().Define("GdP_PDF_Pages", InputPDF.GetPageCount())
            For i As Integer = 1 To InputPDF.GetPageCount()
                node.GetProperties().Define("Done Reading Page" & i, "True")
                InputPDF.SelectPage(i)
                ImageID = InputPDF.RenderPageToGdPictureImageEx(200, True)

                Dim pgText As String = oGdPictureImaging.PdfAddGdPictureImageToPdfOCR(PdfID, ImageID, Dict, sciroot & "apps\bin\win", "")
                oGdPictureImaging.ReleaseGdPictureImage(ImageID)
            Next i
        Else
            'report out reason for problem.
            Dim errCode As Integer = InputPDF.GetStat()
            node.GetProperties().Define("Error", pdfPath & " GdPicturePDF LoadFromFile Status not OK.  ErrCode = " & errCode)
        End If
        InputPDF.CloseDocument()
        oGdPictureImaging.PdfOCRStop(PdfID)

Post by Loïc » Tue May 22, 2012 12:22 pm

Hello Leo,

Code:

        Dim pgText As String = oGdPictureImaging.PdfAddGdPictureImageToPdfOCR(PdfID, ImageID, Dict, sciroot & "apps\bin\win", "")

You should call oGdPictureImaging.GetStat() right after this statement to diagnose the error, if any.

Quote: "What might cause PdfAddGdPictureImageToPdfOCR to fail silently? would (for example) having a missing dictionary cause an error?"

This can be anything reported by the GdPictureStatus enumeration: typically a memory issue, an invalid dictionary path (the most likely cause), a licensing issue, etc.

Let me know the status, I bet this is a dictionary problem.

Kind regards,

Loïc

Post by lbleicher » Tue May 22, 2012 7:19 pm

Hi Loic-

Thanks for the suggestion.

I added: gdpStat = oGdPictureImaging.GetStat()

after the PdfAddGdPictureImageToPdfOCR statement.

For each page the returned value is 0 (OK)! However, there is still no text added. :(

If I move the dictionary file I get status 801 (OCRDictionaryNotFound), which is correct.
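
The check described above could be wrapped in a small helper along these lines. This is only a sketch: DescribeOcrResult and StatusOk are hypothetical names, not part of GdPicture. It assumes that status 0 means OK (as reported above) and that the string returned by PdfAddGdPictureImageToPdfOCR is the recognized page text, which is how the code in the first post treats it.

```vb
Imports System

Module OcrDiagnostics
    ' Assumption: 0 is the numeric value of the OK status, as observed above.
    Const StatusOk As Integer = 0

    ' Hypothetical helper (not a GdPicture API): turns the status code and the
    ' text returned for one page into a single diagnostic message.
    Function DescribeOcrResult(pageNo As Integer, status As Integer, pageText As String) As String
        If status <> StatusOk Then
            ' A real failure, e.g. 801 when the dictionary cannot be found.
            Return "Page " & pageNo.ToString() & ": OCR failed, status " & status.ToString()
        End If
        If String.IsNullOrWhiteSpace(pageText) Then
            ' The case seen in this thread: no error reported, but no text either.
            Return "Page " & pageNo.ToString() & ": status OK but no text recognized"
        End If
        Return "Page " & pageNo.ToString() & ": OK, " & pageText.Length.ToString() & " characters"
    End Function

    Sub Main()
        ' Sample values only; in the real loop they would come from GdPicture.
        Console.WriteLine(DescribeOcrResult(1, 0, ""))
        Console.WriteLine(DescribeOcrResult(2, 801, ""))
        Console.WriteLine(DescribeOcrResult(3, 0, "Some recognized text"))
    End Sub
End Module
```

Inside the page loop, the call would look something like DescribeOcrResult(i, CInt(oGdPictureImaging.GetStat()), pgText), with the result logged per page. Separating "error status" from "OK but empty text" makes the second case, which is the one I am hitting, stand out in the log.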

I am certain my test document has text, as I have a processed version from January where the text has been added.

Any other suggestions?

Thanks,
Leo

Post by Loïc » Wed May 23, 2012 1:16 pm

Hello,

You should first check that you are using our latest version.
If the problem persists with this version, we will need a standalone application reproducing the issue and the document used.

Kind regards,

Loïc

Post by lbleicher » Wed May 23, 2012 6:25 pm

Hi Loic-

I am using
GdPicture.NET 8 (8.5.0.0)
GdPicture Tesseract 2 OCR Plugin 2.0.0.5 (12/9/2011)
GdPicture.NET PDF Plugin 1.0.0.22

Thanks,
Leo

Post by Loïc » Wed May 23, 2012 7:51 pm

Well, I think you should try version 8.5.29 :)

You can also send us the file for cross-checking. If it contains confidential data, just send it via https://www.gdpicture.com/support/getting-support-from-our-team

Regards,

Loïc

Post by lbleicher » Sat May 26, 2012 11:36 pm

Hi Loic-

Upgrading to 8.5.29 has fixed the issue.

Thanks,
Leo
