-
Miklos_Somogyi@adobeforums.com #1
OCR help
Mac OS X 10.3.7, Acrobat Standard 6.0.4
My Epson scanner came with an OCR program (FineReader). It works, sort of.
Getting sick of all the editing, I tried Acrobat's OCR:
Create PDF by scanning > ~75% quality > 400 dpi grayscale > Document > Paper Capture >
Start > Formatted Text and Graphics (now: white letters on black background, why?) >
Save as plain text
I always get an empty text file. Saving as JPEG or PDF works, but the black background remains.
All I want is an editable plain ASCII file; I don't need paragraphs or anything fancy.
What am I doing wrong?
Miklos
Miklos_Somogyi@adobeforums.com Guest
-
Phil Taz #2
Re: OCR help
IMHO, Acrobat's OCR is waaaay behind the times. I have TextBridge from 1996 and it still beats Acrobat 10 to 1!
The most critical thing is having a suitable starting file. In TextBridge, 400 dpi greyscale or bitmap is perfect: it keeps columns and lines and gives very high accuracy, over 95%. The best I have had from Acrobat was about 70%.
Phil Taz Guest
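For anyone who wants to put a number on accuracy figures like the 95% and 70% quoted above, here is a minimal sketch in Python. It assumes you have a hand-corrected reference text to compare against, and the function name `word_accuracy` is made up for this illustration. It counts how many reference words turn up, in order, in the OCR output. This is only one rough definition of accuracy, not the one TextBridge or Acrobat uses.

```python
from difflib import SequenceMatcher


def word_accuracy(reference: str, ocr_output: str) -> float:
    """Fraction of reference words that also appear, in order, in the OCR output."""
    ref_words = reference.split()
    ocr_words = ocr_output.split()
    if not ref_words:
        # Nothing to recognise: perfect only if the OCR output is empty too.
        return 1.0 if not ocr_words else 0.0
    # autojunk=False keeps difflib from discarding frequent words in long texts.
    matcher = SequenceMatcher(None, ref_words, ocr_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


if __name__ == "__main__":
    # One misread word out of four.
    print(word_accuracy("the quick brown fox", "the qu1ck brown fox"))  # 0.75
```

An empty OCR output, like the empty text file described in post #1, scores 0.0 against any non-empty reference.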
-
Gene_Hutton@adobeforums.com #3
Re: OCR help
This white text on a black background seems to have been a long-standing issue with Epson scanners. Apparently their TWAIN driver is poorly written for the Mac. The solution would be to scan the page as a JPEG. If it has a black background, you need to take it up with Epson rather than waste your time here. If it does not, open the JPEG in Acrobat, which will create a PDF, and then run Paper Capture to OCR the document. I hope this helps.
Gene_Hutton@adobeforums.com Guest
-
Miklos_Somogyi@adobeforums.com #4
Re: OCR help
Thanks Phil, but I haven't gotten even that far. I only have empty files.
Thanks Gene, I don't think that this black<-->white thing is due to Epson.
The change occurs when I invoke Paper Capture.
Otherwise things are all right.
Miklos
Miklos_Somogyi@adobeforums.com Guest
-
Miklos_Somogyi@adobeforums.com #5
Re: OCR help
Gents,
I've tried a different tack. I scanned the document in GraphicConverter, removed
the non-text stuff, made the lines absolutely horizontal, and saved the file as JPEG.
I opened the JPEG in Acrobat and ran Paper Capture. No black<-->white conversion,
the whole thing looked perfect. Just a few words were in bold, unexpectedly.
I saved as text and surprise-surprise: the bold stuff was not in there.
Still, better than FineReader. BTW, on the way out Acrobat crashed, but the file was saved by then. If one is grateful for small mercies, this is a viable way to OCR a document.
Miklos
Miklos_Somogyi@adobeforums.com Guest
-
Miklos_Somogyi@adobeforums.com #6
Re: OCR help
Gents, even better: I suspected that the bold words were OCR suspects, but
they all looked OK, so I went for "Find all", hoping that Acrobat would let me accept them
in one go. It did not.
Then I tried "Find 1st suspect" and accepted them one by one. 100% accuracy.
The method is tedious but faster than correcting every suspect/missing word by
hand.
Miklos
Miklos_Somogyi@adobeforums.com Guest




