OCR Multipage TIFF with Rotated Pages

Posted: Mon Aug 21, 2017 3:30 pm
by nieho003
Hi,

I'm trying to use the OCR plugin to convert the multipage TIFFs we create while scanning into OCR'd PDFs. This is working well when we scan in files with a standard orientation, but we sometimes need to scan in a way that the text is rotated 90 degrees. When we try to OCR these, only a few random characters are identified.

Is it possible to have rotated pages be OCR'd while keeping the scanned rotation? I'm currently using the following code for our OCR:

Code:

        // Shared GdPicture imaging instance and the OCR dictionary location.
        private static GdPictureImaging imaging = new GdPictureImaging();
        private static string imageDictionaryDirectory = @"{Path to Dictionary}";

        public static bool OcrTiffToPdf(string inputFile, string outputPdf)
        {
            var imageID = imaging.CreateGdPictureImageFromFile(inputFile);

            // Stop early if the TIFF could not be loaded.
            if (imaging.GetStat() != GdPictureStatus.OK)
            {
                return false;
            }

            string ocr = imaging.PdfOCRCreateFromMultipageTIFF(imageID, "eng", imageDictionaryDirectory, String.Empty,
                outputPdf, true, String.Empty, String.Empty, String.Empty, String.Empty, String.Empty);

            // Read the OCR status before releasing the image, because GetStat()
            // reflects the most recent call on this object.
            bool ocrSucceeded = imaging.GetStat() == GdPictureStatus.OK;

            imaging.ReleaseGdPictureImage(imageID);

            return ocrSucceeded && !String.IsNullOrWhiteSpace(ocr);
        }

Re: OCR Multipage TIFF with Rotated Pages

Posted: Fri Aug 25, 2017 3:57 pm
by Loïc
Hi,

You should convert the TIFF to PDF first, then run OCR on the PDF using the OCRPages() method of the GdPicturePDF class, which automatically detects the orientation.

I hope this helps.

Kind regards,

Loïc

Re: OCR Multipage TIFF with Rotated Pages

Posted: Mon Aug 28, 2017 9:17 pm
by nieho003
As I understand it, that requires the PDF plugin, and we only have the imaging and OCR plugins. We don't really have any other use for the PDF plugin.

Is there a way to achieve this without the PDF plugin?

Re: OCR Multipage TIFF with Rotated Pages

Posted: Mon Aug 28, 2017 9:44 pm
by nieho003
I see that there is a way to automatically rotate pages of a TIFF:

viewtopic.php?t=4893

But I need to maintain the original orientation in the end PDF. Would there be a way to process a TIFF page by page: rotate, OCR, rotate back, and somehow maintain that OCR and make that into a PDF?
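
To make the "rotate, OCR, rotate back" idea concrete, the part that has to be "maintained" is the position of each recognized word. Below is a minimal sketch of that coordinate bookkeeping only. It assumes the OCR step can report word bounding boxes in pixel coordinates of the rotated page, which nothing in this thread confirms for the OCR plugin, and it does not address writing the text into a PDF. Box and OcrBoxMapper are hypothetical names for illustration, not part of GdPicture.

```csharp
// Axis-aligned box in pixel coordinates, origin at the page's top-left corner.
// Hypothetical helper type for illustration; not part of GdPicture.
public readonly struct Box
{
    public Box(int left, int top, int width, int height)
    {
        Left = left;
        Top = top;
        Width = width;
        Height = height;
    }

    public int Left { get; }
    public int Top { get; }
    public int Width { get; }
    public int Height { get; }
}

// Maps boxes found on a rotated copy of a page back onto the unrotated scan.
public static class OcrBoxMapper
{
    // The page was turned 90 degrees clockwise before OCR.
    // rotatedWidth is the width of the rotated page (the original page's height).
    public static Box UndoClockwise90(Box b, int rotatedWidth) =>
        new Box(b.Top, rotatedWidth - b.Left - b.Width, b.Height, b.Width);

    // The page was turned 90 degrees counter-clockwise before OCR.
    // rotatedHeight is the height of the rotated page (the original page's width).
    public static Box UndoCounterClockwise90(Box b, int rotatedHeight) =>
        new Box(rotatedHeight - b.Top - b.Height, b.Left, b.Height, b.Width);
}
```

For example, a 2480 x 3508 pixel scan turned clockwise becomes 3508 pixels wide, so a word box at (100, 200) sized 300 x 40 on the rotated page maps back to (200, 3108) sized 40 x 300 on the original.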

Re: OCR Multipage TIFF with Rotated Pages

Posted: Thu Aug 31, 2017 3:52 pm
by Loïc
Hi,

Unfortunately, this is not possible without the PDF plugin. It is a high-level feature that is supported only through that plugin.

Kind regards,

Loïc