Hi Loïc,
I've found some information that may be helpful! I installed Process Monitor (http://technet.microsoft.com/en-us/sysi ... 96645.aspx) on my system so that I could watch for any file I/O errors while the Tesseract engine was running under IIS. I noticed this entry (the key lines are Result, the first Path, and User):
Date & Time: 10/3/2009 12:28:48 PM
Event Class: File System
Operation: CreateFile
Result: ACCESS DENIED
Path: C:\Windows\System32\inetsrv\GdPicture.NET.ocr.tesseract.dll
TID: 57548
Duration: 0.0000770
Desired Access: Generic Read/Write
Disposition: OverwriteIf
Options: Synchronous IO Non-Alert, Non-Directory File, Open No Recall
Attributes: n/a
ShareMode: None
AllocationSize: 0
Description: IIS Worker Process
Company: Microsoft Corporation
Name: w3wp.exe
Version: 7.0.6001.18000
Path: c:\windows\system32\inetsrv\w3wp.exe
Command Line: c:\windows\system32\inetsrv\w3wp.exe -a \\.\pipe\iisipm0d1e1454-db72-4f11-ac10-d8da255c9d17 -v "v2.0" -h "C:\inetpub\temp\apppools\DefaultAppPool.config" -w "" -m 0 -t 20 -ap "DefaultAppPool"
PID: 54284
Parent PID: 3480
Session ID: 0
User: NT AUTHORITY\NETWORK SERVICE
Auth ID: 00000000:000003e4
Architecture: 32-bit
Virtualized: False
Integrity: System
Started: 10/3/2009 12:24:05 PM
Ended: (Running)
Modules:
w3wp.exe
So, I gave the "Network Service" account access to the c:\windows\system32\inetsrv\ folder, and this solved the problem. The ASP.NET application runs in the "DefaultAppPool" application pool in IIS (see the command line in the entry above), which runs under the "Network Service" account. Makes sense.
But going forward, this will cause some problems. We have a large customer base that we're hoping to upgrade to the Tesseract OCR engine. Asking everyone to change the permissions on their IIS folder is not appealing; most of these clients don't have dedicated IT staff. (They'd probably telephone us and complain! yikes! haha)
I've read some articles online (http://sjc.ironspeed.com/post?id=2496571) that seem to indicate that files end up in the c:\windows\system32\inetsrv\ folder when a .NET library writes to the disk without specifying a full path. Is it possible that Tesseract is trying to write temporary files to the disk? If so, is it possible to provide Tesseract with a default "temp" folder? We have a temp folder within our system that does have the correct permissions (Network Service can access it). If we could specify a temp path, I think these issues would go away!
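To show the behavior I suspect, here is a minimal sketch in Python. It is only an illustration, not GdPicture or Tesseract code (I haven't seen their internals). It shows that a bare file name resolves against the process's current working directory, which for w3wp.exe appears to be c:\windows\system32\inetsrv\, while a name joined to an explicit temp folder lands in that folder instead. The name scratch.tmp, the resolve() helper and its temp_dir argument are all made up for the demo.

```python
import os
import tempfile


def resolve(name, temp_dir=None):
    """Return the absolute path that a file name would be written to.

    With no temp_dir, a bare name resolves against the process's current
    working directory. With temp_dir (a hypothetical setting), the file
    lands in that folder instead.
    """
    if temp_dir is None:
        return os.path.abspath(name)
    return os.path.abspath(os.path.join(temp_dir, name))


def demo():
    """Compare where 'scratch.tmp' lands with and without a temp folder."""
    original_cwd = os.getcwd()
    # fake_inetsrv stands in for the worker process's working directory;
    # app_temp stands in for an application temp folder with good permissions.
    with tempfile.TemporaryDirectory() as fake_inetsrv, \
            tempfile.TemporaryDirectory() as app_temp:
        os.chdir(fake_inetsrv)
        try:
            bare = resolve("scratch.tmp")
            explicit = resolve("scratch.tmp", app_temp)
        finally:
            # Leave the temp directory before it is deleted.
            os.chdir(original_cwd)
        return os.path.dirname(bare), os.path.dirname(explicit)


if __name__ == "__main__":
    bare_dir, explicit_dir = demo()
    print("bare name resolves into:    ", bare_dir)
    print("explicit temp resolves into:", explicit_dir)
```

If the OCR plugin exposed something like the temp_dir argument above, we could point it at our own temp folder and leave the permissions on the IIS folder alone.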
I do have a project that reproduces the problem. You can use the original sample that I provided in the first message of this thread (TesseractTest.zip). Just unzip the files to a folder, adjust the path to the OCR libraries in default.aspx.vb if needed, and place copies of GdPicture.NET.dll, GdPicture.NET.image.gdimgplug.dll, GdPicture.NET.ocr.tesseract.dll and GdPicture.NET.pdf.gdpdfplug.dll into the "BIN" folder within the sample project. Then go into IIS, right-click "Default Web Site", and choose "New" -> "Virtual Directory". A wizard will appear. Enter the alias "TesseractTest" and click "Next". Choose the path to the folder where you extracted TesseractTest.zip (e.g. c:\Inetpub\wwwroot\TesseractTest\) and click "Next". Then allow permissions for "Read" and "Run Scripts (Such as ASP)". Finally, open a browser, visit http://localhost/TesseractTest/, and click the "Run OCR" button. You'll notice the PDF does not contain any text from the OCR engine.
Then, give the "Network Service" account access to the "c:\windows\system32\inetsrv\" folder and try again. You should notice that the resulting PDF does contain the OCR text! So, it seems like we can isolate this issue. But there doesn't seem to be anything else that I can do through the GdPicture interface.
If you have trouble setting this up, I can give you remote desktop access to one of our testing servers so that you can sign in and look at everything I've discussed. Just let me know if this would be helpful!
I must admit that I'm really impressed with your support! You guys do a great job! A big chunk of my day is spent doing technical support, and it always drives me nuts when users don't give us enough info to diagnose problems. I'm hoping this info will help!
If there is anything else you need, please let me know!
Have a great weekend!
Chris