IsBlank Method (GdPictureImaging)
Hi,
Is there any difference between V11 and V14 regarding the IsBlank method?
I have just upgraded to V14 to use the OCRPages method, but noticed that the IsBlank method returns different results for the same multipage TIF image. The method was called as follows:
if (m_GdPictureImaging.IsBlank(m_ImageID, float.Parse(this.tifToPDFConfig.DeleteBlanksBlankPixelThresholdInt), true))
this.tifToPDFConfig.DeleteBlanksBlankPixelThresholdInt was set to 99.95
Note: before making this post, I downloaded the latest version, 14.0.0.27. The earlier version, which gave the different result, was 11.2.0.7.
Regards
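One thing that may be worth ruling out when comparing versions: the call above converts the threshold with `float.Parse(string)`, which uses the current culture, so "99.95" can be read differently on a machine whose decimal separator is a comma. Nothing in this thread shows that this is the cause here. The following is only a minimal sketch of a culture-independent parse; `ThresholdParsing` and `ParseThreshold` are hypothetical names, and the 0 to 100 range check is an assumption about the Confidence parameter, not something confirmed in this thread.

```csharp
using System;
using System.Globalization;

public static class ThresholdParsing
{
    // Parses a threshold such as "99.95" the same way on every machine,
    // independent of the current culture's decimal separator.
    public static float ParseThreshold(string text)
    {
        float value = float.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);

        // Assumption: the threshold is a percentage between 0 and 100.
        if (value < 0f || value > 100f)
            throw new ArgumentOutOfRangeException(nameof(text), "Threshold must be between 0 and 100.");

        return value;
    }
}
```

With such a helper, the call would become `IsBlank(m_ImageID, ThresholdParsing.ParseThreshold(config), true)`, where `config` stands for the configured string.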
Re: IsBlank Method (GdPictureImaging)
Hi,
We have changed the blank page detection in order to improve its accuracy. This change might require a slight adjustment of the threshold under certain circumstances.
Feel free to share the impacted images in case you need us to have a look.
Regards,
David
Re: IsBlank Method (GdPictureImaging)
Hi David,
I have attached my test images. This is the process that we have been using with Ver 11 for TIF to PDF OCR:
1. Remove punch holes on even pages only using GdPictureImaging.RemoveHolePunch
- Note: Ver 11 does not have AccountForPunchHoles in the IsBlank method; with Ver 14 I can remove this step.
2. Auto rotate
3. Detect and delete blank pages on even pages only using GdPictureImaging.IsBlank(m_ImageID, float.Parse(this.tifToPDFConfig.DeleteBlanksBlankPixelThresholdInt), true)
Add or delete the page based on the IsBlank result with threshold = 99.95:
- GdPictureImaging.TiffAddToMultiPageFile(m_ImageID, i, tiffcompression)
- GdPictureImaging.TiffDeletePage(m_ImageID, i)
4. OCR using GdPictureImaging.PdfOCRCreateFromMultipageTIFF
Note:
Two tests were run, one referencing Ver 11 and one referencing Ver 14, with exactly the same settings, and Ver 11 seems to be better.
You mentioned that the blank page detection was changed. Is there an equivalent scale? We have been using different settings for different groups of documents.
I am looking forward to being able to use V14 with its OCRPages method and PDFCompression settings.
Regards,
Tri
Attachments:
- TestDoc.zip (1.99 MiB)
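No fixed conversion between the V11 and V14 thresholds is given in this thread, so one practical option might be to re-calibrate each document group against a small set of pages whose blank/non-blank status is known. The following is a minimal sketch of such a sweep. `ThresholdCalibration` and `CountMismatches` are hypothetical names, and the `isBlank` delegate is a stand-in for whatever detector call is used (for example, a lambda that selects a page and calls IsBlank); it is not part of GdPicture.

```csharp
using System;
using System.Collections.Generic;

public static class ThresholdCalibration
{
    // For each candidate threshold, counts how many pages the detector
    // classifies differently from the expected label.
    // expectedBlank[0] is the label for page 1, and so on.
    public static Dictionary<float, int> CountMismatches(
        IReadOnlyList<bool> expectedBlank,
        Func<int, float, bool> isBlank,
        IEnumerable<float> candidateThresholds)
    {
        var result = new Dictionary<float, int>();
        foreach (float threshold in candidateThresholds)
        {
            int mismatches = 0;
            for (int page = 1; page <= expectedBlank.Count; page++)
            {
                if (isBlank(page, threshold) != expectedBlank[page - 1])
                    mismatches++;
            }
            result[threshold] = mismatches;
        }
        return result;
    }
}
```

Running the sweep once per document group would give a per-group threshold for V14 that reproduces the decisions previously accepted under V11, as far as the labelled sample is representative.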
Re: IsBlank Method (GdPictureImaging)
Hi,
What are we supposed to do with the image? Could you provide a code snippet and describe the expected result vs the obtained result?
Kind regards,
Loïc
Re: IsBlank Method (GdPictureImaging)
Hi Loïc,
I apologise for the late response.
As described in my first post, the IsBlank method returns different results after updating from V11 to V14 (using the same threshold, 99.95): blank pages that were detected using V11 are not detected using V14.
David (in one of the replies) mentioned that the blank page detection was changed to improve accuracy, hence the change of threshold.
Improving the method is good, but we did not expect a change of threshold, as it makes a "blind" upgrade to V14 impractical.
Is there an equivalent scale between the two versions?
Is there a way we could use the IsBlank method of V11 together with the OCRPages method of V14?
Here is the code snippet:
using (GdPictureImaging m_GdPictureImaging = new GdPictureImaging())
{
    int m_ImageID;
    m_GdPictureImaging.TiffOpenMultiPageForWrite(true); // For performance
    m_ImageID = m_GdPictureImaging.CreateGdPictureImageFromFile(sourcePath);
    _PDFtoOCR.NewPDF(this.tifToPDFConfig.PDFA);
    _PDFtoOCR.OcrPagesProgress += this.OcrPagesProgress;
    _PDFtoOCR.OcrPagesDone += this.OcrPagesDone;
    float resolution = System.Math.Max(this.tifToPDFConfig.PDFDPI_IfLessThen_Int_ElseOriginal, m_GdPictureImaging.GetVerticalResolution(m_ImageID));
    switch (this.tifToPDFConfig.PDFCompressionBitonalSetting.ToUpper())
    {
        case "CCITT4":
            _PDFtoOCR.SetCompressionForBitonalImage(PdfCompression.PdfCompressionCCITT4);
            break;
        case "JBIG2":
            _PDFtoOCR.SetCompressionForBitonalImage(PdfCompression.PdfCompressionJBIG2);
            break;
        default:
            break;
    }
    switch (this.tifToPDFConfig.PDFCompressionColorSetting.ToUpper())
    {
        case "JPEG":
            _PDFtoOCR.SetCompressionForColorImage(PdfCompression.PdfCompressionJPEG);
            _PDFtoOCR.SetJpegQuality(this.tifToPDFConfig.PDFJPEGQuality_0_Worse_100_Best);
            break;
        case "JPEG2000":
            _PDFtoOCR.SetCompressionForColorImage(PdfCompression.PdfCompressionJPEG2000);
            _PDFtoOCR.SetJpeg2000Quality(this.tifToPDFConfig.PDFJPEG2000Quality_1_Best_512_Worse);
            break;
        default:
            break;
    }
    if (m_GdPictureImaging.TiffIsMultiPage(m_ImageID))
    {
        #region Tiff is multipage
        int NumberofDeletedPages = 0;
        int NumberOfPages = m_GdPictureImaging.TiffGetPageCount(m_ImageID);
        //loop through pages
        for (int i = 1; i <= NumberOfPages; i++)
        {
            //select each page in TIFF file
            m_GdPictureImaging.TiffSelectPage(m_ImageID, i);
            if (this.tifToPDFConfig.Detect_CompressBW == true)
            {
                m_GdPictureImaging.ColorDetection(m_ImageID, true, true, true);
            }
            if (this.tifToPDFConfig.BrightenColourInt > 0)
            {
                m_GdPictureImaging.SetBrightness(m_ImageID, this.tifToPDFConfig.BrightenColourInt);
            }
            if (this.tifToPDFConfig.RequiredResolution > 0)
            {
                ResizePage(m_ImageID, m_GdPictureImaging);
            }
            // 2016-10-20 : L. Oliver : Start of Addition
            if (this.tifToPDFConfig.PunchHoleRemovalAllOddEvenNone.ToLower() == "all" ||
                (this.tifToPDFConfig.PunchHoleRemovalAllOddEvenNone.ToLower() == "odd" && (i % 2 != 0)) ||
                (this.tifToPDFConfig.PunchHoleRemovalAllOddEvenNone.ToLower() == "even" && (i % 2 == 0)))
            {
                // Combine the configured margins into a single flags value.
                List<String> lstPunchHoleMargins = new List<String>(this.tifToPDFConfig.PunchHoleLocationsRightLeftTopBottom.ToLower().Split(new char[] { ',' }));
                HolePunchMargins margins = 0;
                if (lstPunchHoleMargins.Contains("left")) margins |= HolePunchMargins.MarginLeft;
                if (lstPunchHoleMargins.Contains("right")) margins |= HolePunchMargins.MarginRight;
                if (lstPunchHoleMargins.Contains("top")) margins |= HolePunchMargins.MarginTop;
                if (lstPunchHoleMargins.Contains("bottom")) margins |= HolePunchMargins.MarginBottom;
                // No recognised margin configured: fall back to all four margins.
                if (margins == 0)
                    margins = HolePunchMargins.MarginLeft | HolePunchMargins.MarginRight | HolePunchMargins.MarginTop | HolePunchMargins.MarginBottom;
                status = m_GdPictureImaging.RemoveHolePunch(m_ImageID, margins);
            }
            if (this.tifToPDFConfig.AutoRotateEnabled)
            {
                int intPageRotation = m_GdPictureImaging.OCRTesseractGetOrientation(m_ImageID, this.tifToPDFConfig.Language, this.tifToPDFConfig.DictionaryPath);
                if (intPageRotation != 0)
                    status = m_GdPictureImaging.RotateAngle(m_ImageID, 360 - intPageRotation);
            }
            if (this.tifToPDFConfig.DeleteBlanks == true)
            {
                if (m_GdPictureImaging.IsBlank(m_ImageID, float.Parse(this.tifToPDFConfig.DeleteBlanksBlankPixelThresholdInt), true))
                {
                    if (this.tifToPDFConfig.DeleteBlanksAllOddEven.ToLower() == "all" ||
                        (this.tifToPDFConfig.DeleteBlanksAllOddEven.ToLower() == "odd" && (i % 2 != 0)) ||
                        (this.tifToPDFConfig.DeleteBlanksAllOddEven.ToLower() == "even" && (i % 2 == 0)))
                    {
                        // GdPicture11
                        //status = m_GdPictureImaging.TiffDeletePage(m_ImageID, i);
                        // GdPicture14: the blank page is simply not added to the PDF
                        NumberofDeletedPages++;
                    }
                    else
                    {
                        // GdPicture11
                        //status = m_GdPictureImaging.TiffAddToMultiPageFile(m_ImageID, i, tiffcompression);
                        // GdPicture14
                        _PDFtoOCR.AddImageFromGdPictureImage(m_ImageID, false, true);
                    }
                }
                else
                {
                    // GdPicture11
                    //status = m_GdPictureImaging.TiffAddToMultiPageFile(m_ImageID, i, tiffcompression);
                    // GdPicture14
                    _PDFtoOCR.AddImageFromGdPictureImage(m_ImageID, false, true);
                }
            }
            else
            {
                // GdPicture11
                //status = m_GdPictureImaging.TiffAddToMultiPageFile(m_ImageID, i, tiffcompression);
                // GdPicture14
                _PDFtoOCR.AddImageFromGdPictureImage(m_ImageID, false, true);
            }
        }
        // check if searchable PDF req
        if (this.tifToPDFConfig.SearchablePDF)
        {
            // GdPicture11 code
            // GdPicture14 code
        }
        else
        {
            // GdPicture11 code
            // GdPicture14 code
        }
        // Sanity check: pages written plus pages dropped must equal the source page count.
        int pageCount = 0;
        using (GdPicturePDF gdPicturePDF = new GdPicturePDF())
        {
            gdPicturePDF.LoadFromFile(this.GetTempOutputFilename(file), true);
            pageCount = gdPicturePDF.GetPageCount();
        }
        if (NumberOfPages != (pageCount + NumberofDeletedPages))
        {
            success = false;
        }
        #endregion
    }
    else // is single page
    {
        #region Tiff is single
        #endregion
    }
    if (!(status == GdPictureStatus.OK))
    {
        success = false;
    }
    _PDFtoOCR.CloseDocument();
    // Release the image once, for both the multipage and single-page branches.
    m_GdPictureImaging.ReleaseGdPictureImage(m_ImageID);
    // No explicit Dispose() call is needed: the using statement disposes m_GdPictureImaging.
}
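The all/odd/even test appears twice in the snippet above (punch-hole removal and blank-page deletion). As a minimal sketch, it could be factored into one helper so both places apply the same rule; `PageSelection` and `AppliesToPage` are hypothetical names and are not part of GdPicture.

```csharp
using System;

public static class PageSelection
{
    // Returns true when a 1-based page number is covered by a mode of
    // "all", "odd" or "even" (case-insensitive). Any other value,
    // such as "none", selects no pages.
    public static bool AppliesToPage(string mode, int pageNumber)
    {
        if (string.Equals(mode, "all", StringComparison.OrdinalIgnoreCase))
            return true;
        if (string.Equals(mode, "odd", StringComparison.OrdinalIgnoreCase))
            return pageNumber % 2 != 0;
        if (string.Equals(mode, "even", StringComparison.OrdinalIgnoreCase))
            return pageNumber % 2 == 0;
        return false;
    }
}
```

The blank-page condition would then read `PageSelection.AppliesToPage(this.tifToPDFConfig.DeleteBlanksAllOddEven, i)`.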
Re: IsBlank Method (GdPictureImaging)
Hi,
Could you attach the image that is badly recognized with the engine?
Kind regards,
Loïc
Re: IsBlank Method (GdPictureImaging)
Hi Loïc,
I attached the doc in my previous reply.
I have attached it again here anyway.
Regards,
Tri Nguyen
Attachments:
- TestDoc.zip (1.99 MiB)
Re: IsBlank Method (GdPictureImaging)
Hi,
The latest version correctly detects the provided image as non-blank.
Kind regards,
Loïc
Re: IsBlank Method (GdPictureImaging)
Thanks Loïc,
I will check out your latest release.
Kind Regards,
Tri
Re: IsBlank Method (GdPictureImaging)
Hi Loïc,
Firstly, I would like to thank the GdPicture team for the recent release. Blank page dropout is now much better.
Secondly, what we are trying to do is print a document ID on the back of every page (adjustable position), and we expect the AccountForMargins option of the IsBlank method to take care of it.
Could you please explain how AccountForMargins works in the IsBlank(Int32,Single,Boolean,Boolean) method:
public bool IsBlank(
int ImageID,
float Confidence,
bool AccountForMargins,
bool AccountForPunchHoles
)
I found a discussion regarding this (viewtopic.php?t=4071).
If you need more info, please ask me.
Thank you for any help you can offer.
Tri
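How GdPicture implements AccountForMargins is not stated in this thread. Purely to illustrate the general idea behind the question, the following minimal sketch computes the percentage of white pixels while ignoring a border of fixed width, so that marks printed inside that border (such as a document ID) do not lower the score. `BlankRatio`, `WhitePercentInsideMargins` and the `bool[,]` page representation are hypothetical and are not GdPicture's algorithm or API.

```csharp
using System;

public static class BlankRatio
{
    // Percentage (0-100) of white pixels after ignoring a border of
    // 'margin' pixels on every side. pixels[y, x] is true for a white pixel.
    public static double WhitePercentInsideMargins(bool[,] pixels, int margin)
    {
        int height = pixels.GetLength(0);
        int width = pixels.GetLength(1);
        if (margin < 0 || 2 * margin >= height || 2 * margin >= width)
            throw new ArgumentOutOfRangeException(nameof(margin));

        long white = 0;
        long total = 0;
        for (int y = margin; y < height - margin; y++)
        {
            for (int x = margin; x < width - margin; x++)
            {
                total++;
                if (pixels[y, x]) white++;
            }
        }
        return 100.0 * white / total;
    }
}
```

Under this kind of scheme, an ID would only be ignored if its adjustable position stays entirely inside the excluded border; whether the library behaves this way is the open question of the post above.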