OCR (optical character recognition) with scanned documents stored in PDF format

  1. #1

    We have converted a substantial number of drawings and documents to PDF using several methods:

    1) large-format flatbed scanner (scan to TIF, then convert to PDF)

    2) desktop scanner on a multifunction printer/copier/scanner (scan to TIF, then convert to PDF)

    3) print to file using the PDF distiller (distill Word, Excel, AutoCAD, etc.) directly to PDF

    It would be very useful to "find" or "search" for words and phrases in these files, using a search command like the one in Adobe Reader 6.0.

    Does Adobe provide a product, or does a third-party add-on exist, that would provide this capability?

    thanks in advance for your support,

    ray
    ray_mayer@adobeforums.com Guest

  3. #2

    Are you looking for something that will perform OCR on your image documents or are you looking for something that will build indexes to allow searching across large sets of your (already OCR-ed) documents?
    W_T_Allen@adobeforums.com Guest

  4. #3

    ray_mayer@adobeforums.com wrote in message news:<3bb427c6.-1@webx.la2eafNXanI>...
    > We have converted a substantial number of drawings and documents into pdf format via several methods;
    >
    > does adobe provide a product or does a third party add-on exist that would provide this type of capability?
    >
    > thanks in advance for your support,
    >
    > ray
    Yes, Adobe has Acrobat Capture for this very purpose. Many
    third-party products also exist and do a great job. I prefer
    FineReader. Be prepared to review and correct the capture
    suspects/errors. It is tedious.
    Ravi
    Captain Guest

  5. #4

    At present, our need is to find words in a single open document that was scanned on a flatbed scanner without any OCR software being used during the scan.
    ray_mayer@adobeforums.com Guest

  6. #5

    Acrobat can do this, but for large numbers of documents or for large-format images you should look into getting Capture.
    W_T_Allen@adobeforums.com Guest

  7. #6

    Thanks for the response.

    I am presently looking into Capture.

    How can I do this with my Adobe Reader 5.0 or 6.0?

    I have tried the search tool, which does not work on my scanned document.

    Any thoughts?

    Thanks
    ray_mayer@adobeforums.com Guest

  8. #7

    I think that anything that has been converted to PDF from Word, Excel, etc. using Acrobat Distiller can be easily searched using the Acrobat search function. You could also build an Acrobat Catalog of all your documents, and then you would be able to search across all of them for specific words or phrases. If you plan to create a catalog, may I suggest that you use Acrobat 5. I have found Acrobat 6's Catalog function to be highly inefficient: it tends to create very large index files that far exceed the aggregate size of the cataloged files.
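
    The idea behind searching across a catalog can be sketched as a tiny inverted index: each word maps to the set of documents that contain it. This is only an illustration of the concept, not Acrobat Catalog's actual index format. The names `build_index` and `search` are made up for this example, and the document text is assumed to be already extracted (or OCRed).

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document names containing it.

    `docs` is a dict of {document name: extracted text} (hypothetical input).
    """
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, phrase):
    """Return the names of documents that contain every word in `phrase`."""
    words = re.findall(r"[a-z0-9]+", phrase.lower())
    if not words:
        return set()
    # Start with the documents for the first word, then intersect the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

    For example, after `idx = build_index({"a.pdf": "Pump house drawing", "b.pdf": "pump schedule"})`, calling `search(idx, "pump drawing")` returns only `a.pdf`. An index like this stores one entry per distinct word, which gives some feel for why index files can grow large.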

    However, for the TIF files converted to PDF, you are out of luck, since these are just image files and there are no words to search. You could resolve this problem by having the documents OCRed (I use the OmniPage software to do that). Alternatively, you could enter certain keywords in the document information page, but those would be the only words you would be able to search for.
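
    To sort a large batch into "already searchable" and "needs OCR", a rough check is to look for font resources in the file: a page that is only a scanned image normally references an image object and no fonts. The sketch below is a crude byte-level heuristic under that assumption, and `looks_image_only` is a made-up name. It can be fooled by PDFs that keep their dictionaries in compressed object streams, so a proper PDF parser would be more reliable.

```python
import re


def looks_image_only(pdf_bytes):
    """Crude heuristic: True if the raw bytes mention image objects but no fonts.

    Assumption: searchable PDFs reference at least one /Font resource,
    while scan-only PDFs contain /Subtype /Image objects and no fonts.
    """
    has_font = b"/Font" in pdf_bytes
    has_image = re.search(rb"/Subtype\s*/Image", pdf_bytes) is not None
    return has_image and not has_font
```

    Usage would be along the lines of `looks_image_only(open("drawing.pdf", "rb").read())`, where `drawing.pdf` is a placeholder file name. A scan that has already been OCRed contains both the page image and a hidden text layer with fonts, so it is reported as not image-only.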
    Simon_Gill@adobeforums.com Guest

  9. #8

    Yes, the PDF created from Word docs can already be searched, but the scanned images need to be OCRed, which is why I recommended Capture if there were going to be lots of documents. If it's not many, Acrobat already has a built-in OCR engine.

    Ray, you cannot do this with any version of Reader; you need Acrobat or Capture.
    W_T_Allen@adobeforums.com Guest

  10. #9

    ray_mayer@adobeforums.com wrote in message news:<3bb427c6.1@webx.la2eafNXanI>...
    > presently our need is to find words in a single open document that has been scanned on a flat bed scanner that did not use any OCR software during the scan process.
    You cannot search for words until you carry out optical character
    recognition on the document. If OCR was not carried out at the time
    of the scan, you have to do so now. Hopefully, the scan parameters
    will support good OCR processing.
    Ravi
    Captain Guest
