OCR results file
Posted: Wed Aug 17, 2016 6:32 am
Hi,
I've built an invoice recognition/learning product that relies on the Recostar OCR engine. In particular, it processes the XML OCR results file created by Recostar. Because I can't find any way to license Recostar, and because I want to build a web-based point-and-click indexing solution, I'm considering GdPicture.
What I'd like to know is whether Tesseract produces a similar kind of output file, giving all characters and words along with their locations. Or do you have to use this command to get that information: PdfReaderGetPageTextWithCoords?
Note that I searched for that command in the online documentation and got no hits, which is a bit of a worry.
Thanks, Turhan