Editing text from OCR Conversion - Adobe Acrobat Windows

  1. #1

    Editing text from OCR Conversion

    After running OCR on PDF images, I can search them in Acrobat, but I find that the OCR conversion isn't fully accurate. Is there a way to edit the text that is stored behind the page image using Adobe Acrobat?
    wynn_williamson@adobeforums.com Guest

  2. #2

    Re: Editing text from OCR Conversion

    After running the OCR did you then check for capture suspects?
    Jonathan_H@adobeforums.com Guest

  3. #3

    Re: Editing text from OCR Conversion

    I did, but I am working with 1000+ page documents of poor quality. The problem I'm encountering is that when I convert the PDF through the OCR program (OmniPage Pro), I lose all of my bookmarks. Once I have the converted PDF I can redo the bookmarks, but if I want to make any changes to the searchable text, I have to send it back through OmniPage and I lose my bookmarks again.
    wynn_williamson@adobeforums.com Guest

  4. #4

    Re: Editing text from OCR Conversion

    Why not use Acrobat's OCR?

    Aandi Inston
    Aandi_Inston@adobeforums.com Guest

  5. #5

    Re: Editing text from OCR Conversion

    Are you talking about Catapult? Yes, I see that Adobe makes a program. It appears that the main problem here is that Adobe doesn't want to cooperate with other vendors' software (I have had a second problem that seems to confirm this). I am somewhat resentful about this, and it will certainly affect my decision to use Adobe products in the future.
    wynn_williamson@adobeforums.com Guest

  6. #6

    Re: Editing text from OCR Conversion

    It is called Capture.

    There is a program called Adobe Capture which is essentially for converting bulk paper into PDF and running OCR.

    There is Paper Capture which is a plug-in for Adobe Acrobat - it is free and is included with Acrobat 6 and it can be downloaded for Acrobat 5.
    Jonathan_H@adobeforums.com Guest

  7. #7

    Re: Editing text from OCR Conversion

    I have Acrobat 6. I was not aware that the program had OCR capabilities. In fact, I searched the help extensively and could not find anything on OCR. I can convert the document, but now I am having problems correcting OCR suspects. When I select "find first OCR suspect," it returns a message saying "Capture Complete" without finding any errors for me to fix. There are errors, however. Do you have any suggestions? Thanks for all of the help!
    wynn_williamson@adobeforums.com Guest

  8. #8

    Re: Editing text from OCR Conversion

    I see that you have to convert to "formatted text and graphics," when doing the paper capture, but I cannot compromise the integrity of the original image to make it searchable...any suggestions?
    wynn_williamson@adobeforums.com Guest

  9. #9

    Re: Editing text from OCR Conversion

    >...but I cannot compromise the integrity of the original image to make it searchable...any suggestions?

    My experience with OCR is somewhat limited, but I seem to recall that (at least for some OCR software) there is an option to create an "Image with hidden text" type of file. That is, the "look" of your image would not be changed, but the text would be OCR'd. Some poor soul who has had more experience with OCR might be able to elaborate.
    Fr._Watson@adobeforums.com Guest

  10. #10

    Re: Editing text from OCR Conversion

    You have three options:

    Searchable Image (Exact)
    Original image visible, with the actual text hidden.

    Searchable Image (Compact)
    As above, but with compression, which reduces file size but also image quality.

    Formatted Text and Graphics
    As it says really.
    Jonathan_H@adobeforums.com Guest

  11. #11

    Re: Editing text from OCR Conversion

    Fr. Watson is right - that is what I want to accomplish. With OmniPage Pro I can create a PDF with the image over the text. However, I cannot seem to edit this text within Acrobat. I work in a law firm and I cannot alter the appearance of the document for legal reasons. I need to be able to search through the PDFs, but I want to search the text behind the document image. Adobe Paper Capture does a really poor job with documents of lower image quality. I need to be able to alter the text created by OmniPage, or at least be able to edit the Adobe text from Paper Capture without altering the image.
    wynn_williamson@adobeforums.com Guest

  12. #12

    Re: Editing text from OCR Conversion

    >I see that you have to convert to "formatted text and graphics," when doing the paper capture

    Why?


    Aandi Inston
    Aandi_Inston@adobeforums.com Guest

  13. #13

    Re: Editing text from OCR Conversion

    I'm not really sure what you mean by that... I need to convert the pdf to searchable text. If I choose "searchable image (exact)" then I can't make edits to the text after Adobe converts to OCR. If I choose "formattable text and images" then I am stuck with the Adobe distortions of the text.
    wynn_williamson@adobeforums.com Guest

  14. #14

    Re: Editing text from OCR Conversion

    >I'm not really sure what you mean by that... I need to convert the pdf to searchable text. If I choose "searchable image (exact)" then I can't make edits to the text after Adobe converts to OCR. If I choose "formattable text and images" then I am stuck with the Adobe distortions of the text.

    I wonder if there is a way that you could, say, export the OCR'd text to an RTF or text file, make the necessary changes, and then somehow 'slip it behind' the PDF image. I don't have enough experience to know if this is feasible. If I had to try, I think I would save/export the PDF as a text or RTF file, open this in another program, and make the necessary changes to the text. Once this was done, I would, if possible, make all the text white, overlay the text with images of the PDF document, then re-PDF the edited file. I think that this would work, but I have not done it.
    Kamilyon_Bambiraptor@adobeforums.com Guest
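
The "make the necessary changes" step in the workaround above could be scripted once the OCR'd text has been exported to a plain text file. The following is only a minimal sketch of that one step, not something described in the thread: the correction table, the sample misreadings, and the function name `correct_ocr_text` are all hypothetical, and getting the corrected text back behind the page image is not covered.

```python
import re

# Hypothetical correction table: OCR misreading -> intended word.
CORRECTIONS = {
    "c1ient": "client",
    "agreernent": "agreement",
}


def correct_ocr_text(text: str, corrections: dict[str, str]) -> str:
    """Replace whole-word OCR misreadings, leaving all other text untouched."""
    if not corrections:
        return text
    # Match a listed misreading only when it is not part of a longer word.
    alternatives = "|".join(re.escape(word) for word in corrections)
    pattern = re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)")
    return pattern.sub(lambda match: corrections[match.group(1)], text)
```

Calling `correct_ocr_text(exported_text, CORRECTIONS)` returns a corrected copy of the exported text. Matching whole words only is a deliberate choice, so that a short misreading is not replaced inside a longer, correct word.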

  15. #15

    Re: Editing text from OCR Conversion

    You can't edit the text of a searchable image pdf as the text is 'hidden' under the 'image' in the pdf. You can only review capture suspects, although I'm not sure whether this will work with PDF files that were OCR'd outside Acrobat.
    de_Siem@adobeforums.com Guest
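
For background on what 'hidden' text can mean at the file level: the PDF format defines text rendering mode 3 as invisible text (neither filled nor stroked), which still lets the text be searched and copied. The sketch below builds the content-stream operators for one such invisible line. It is an illustration under assumptions, not a description of what Acrobat or OmniPage actually write: the font resource name `F1`, the coordinates, and both function names are hypothetical, and the output is only a fragment, not a complete PDF.

```python
def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def hidden_text_ops(text: str, x: float, y: float,
                    size: float = 12, font: str = "F1") -> str:
    """Return content-stream operators that place `text` invisibly at (x, y)."""
    return (
        "BT\n"                                  # begin text object
        f"/{font} {size:g} Tf\n"                # select font and size
        "3 Tr\n"                                # rendering mode 3: invisible
        f"{x:g} {y:g} Td\n"                     # move to the text position
        f"({escape_pdf_string(text)}) Tj\n"     # show the string
        "ET"                                    # end text object
    )
```

Changing `3 Tr` to `0 Tr` (fill) in such a fragment would make the same text visible on top of or beneath the page image, depending on drawing order.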

  16. #16

    Re: Editing text from OCR Conversion

    >You can't edit the text of a searchable image pdf as the text is 'hidden' under the 'image' in the pdf.

    I guess then the only thing might be to first do an ordinary OCR (i.e., convert to text only, with no image), make corrections in that, then make the text white, overlay an image, etc. Would that possibly work? It's laborious and not ideal, but it might be a workaround.
    Kamilyon_Bambiraptor@adobeforums.com Guest

  17. #17

    Re: Editing text from OCR Conversion

    Hi Wynn,
    Have you found a solution to the problem of editing the 'hidden text'? You are not on your own in coming across this frustrating problem after doing an 'Image' PDF to 'Image + hidden text' PDF conversion. I have also had the same experience trying to capture 'suspects' - in that no 'suspects' are found. I suspect this feature only works when scanning an original paper version rather than doing a file conversion.

    It is a frustrating problem because the searchable text is obviously accessible. It can be copied and pasted into other applications - but not made visible and corrected within Acrobat.

    I am trying to find out if the full version of Capture allows conversion of PDFs and editing of text.

    There must be thousands of users with this problem.

    Regards

    David.
    David_C_Rowland@adobeforums.com Guest

  18. #18

    Re: Editing text from OCR Conversion

    You can edit the text that is hidden behind the printed text, but the process is rather cumbersome.

    Use the Touchup Text Tool to select the line of text that you wish to edit. Select all the text that is in the box. Right click and select 'Attributes'. There is a little box in the bottom left hand side of the Attributes dialog. Click this box and a colour palette will appear. Select a colour different from the printed one. For example, if the printed text is black, select Red. You will then be able to see the hidden searchable text and make the changes you require.
    Andrew_E_D_Clark@adobeforums.com Guest

  19. #19

    Editing text from OCR Conversion

    I came across a really powerful tool; I hope it will solve your problem.

    Aspose.OCR for Java is a Java optical character recognition component that allows developers to add OCR functionality to their Java web applications, web services and Windows applications. It provides a simple set of classes for controlling character recognition tasks. It helps developers work with image files from within their Java applications. It allows developers to extract text from images and read font and style information quickly, saving the time and effort involved in developing an OCR solution from scratch.

    Many thanks
    sherazam Guest
