Export data from pdf

Ask a Question related to Adobe Acrobat SDK, Design and Development.

  1. #1

    Export data from pdf

    I'm writing a simple plugin to export some data from a PDF file: I'm looking for labels made of three digits in a document. Creating the plugin took little effort and it finds about 99% of the labels, but some of them are missing from the output file.

    My technique is to iterate through all the elements of each page looking for elements of type kPDEText. For some labels this does not work, and it seems that they are not stored as kPDEText at all.

    You can find a sample PDF at
    <http://www.nablasoft.com/temp/esploso1.pdf>
    It contains only the elements that the plugin is missing.

    I based my code on the basic plugin of the SDK. If you want to look at my code, you can find the whole solution here (in 7-Zip format):

    <http://www.nablasoft.com/temp/BasicPlugin.7z>

    Does anyone have any ideas?

    Alk.
    gianmaria_ricci@adobeforums.com Guest

  3. #2

    Re: Export data from pdf

    I haven't looked at your code, but the most common error is forgetting to handle recursion into the various "containers" in the PDE layer.
    Leonard_Rosenthol@adobeforums.com Guest
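
The recursion described in this reply can be sketched with a small self-contained model. The `Elem` struct, the `ELEM_*` constants, and the `walk_content` and `count_three_digit` names below are hypothetical stand-ins, not SDK types. In a real plugin the same shape would presumably be built from PDEContentGetNumElems, PDEContentGetElem and PDEObjectGetType, descending with PDEFormGetContent, PDEContainerGetContent or PDEGroupGetContent when the element type is kPDEForm, kPDEContainer or kPDEGroup; check the SDK reference for the exact signatures and ownership rules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for PDE element kinds; these are NOT the SDK types. */
typedef enum {
    ELEM_TEXT, ELEM_PATH, ELEM_IMAGE,
    ELEM_FORM, ELEM_CONTAINER, ELEM_GROUP
} ElemType;

typedef struct Elem {
    ElemType type;
    const char *text;             /* only meaningful for ELEM_TEXT */
    const struct Elem *children;  /* nested content of a form/container/group */
    size_t num_children;
} Elem;

/* Callback invoked for every text element, at any nesting depth. */
typedef void (*TextVisitor)(const char *text, void *user);

/* Visit all text elements, recursing into every element that holds content. */
void walk_content(const Elem *elems, size_t count, TextVisitor visit, void *user)
{
    assert(visit != NULL);
    for (size_t i = 0; i < count; ++i) {
        const Elem *e = &elems[i];
        switch (e->type) {
        case ELEM_TEXT:
            visit(e->text, user);
            break;
        case ELEM_FORM:
        case ELEM_CONTAINER:
        case ELEM_GROUP:
            /* Skipping any one of these cases silently loses nested text. */
            walk_content(e->children, e->num_children, visit, user);
            break;
        default:
            break; /* paths, images, ...: nothing to extract */
        }
    }
}

/* Example visitor: counts text elements that are exactly three digits. */
void count_three_digit(const char *text, void *user)
{
    if (strlen(text) == 3 && strspn(text, "0123456789") == 3)
        ++*(int *)user;
}
```

Calling `walk_content(page, n, count_three_digit, &counter)` on a top-level array counts labels at every depth. If only the form case were handled, a label wrapped in a container or group would be skipped, which would match the "99% found" symptom, though that is only one possible cause.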

  4. #3

    Re: Export data from pdf

    OK, I've run into this kind of problem before. Where can I find a list of all the elements that act as logical "containers" in the PDE layer?

    Thanks in advance.

    Gian Maria.
    gianmaria_ricci@adobeforums.com Guest

  5. #4

    Re: Export data from pdf

    I seem to recall that they have a property which is container - but you can look at the various samples in the SDK that demonstrate how to iterate over content.
    Leonard_Rosenthol@adobeforums.com Guest

  6. #5

    Re: Export data from pdf

    I have narrowed down my problem. For some of my drawings, when I encounter a text element I use PDETextGetText to get its text. It seems that the data comes back not as ASCII but as some form of Unicode.

    All the strings are three digits long, and in the text buffer I found byte sequences like 0, 22, 0, 21, 0, 21 that look like Unicode, but if I treat them as Unicode with wchar_t they contain only garbage characters.

    Maybe there is a Unicode version of PDETextGetText?

    Gian Maria
    gianmaria_ricci@adobeforums.com Guest

  7. #6

    Re: Export data from pdf

    Dear Leonard, I'm working in C, so I work with handles and do not have C++ classes on which to check a container property.

    I get elements with

    PDEElement element = PDEContentGetElem(content, index);

    and it returns a PDEElement that is only an address. In my documents I found only kPDEPath, kPDEForm (and I iterate into the form content) and kPDEImage elements.

    Maybe I am iterating into the form content the wrong way?

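    if (type == kPDEForm) {
        PDEForm form = (PDEForm) element;
        /* Separate name so the outer "content" is not shadowed. */
        PDEContent formContent = PDEFormGetContent(form);
        /* ... iterate over formContent here ... */
        PDERelease((PDEObject) formContent); /* the form's content must be released */
    }

    and then I iterate again over the PDEContent elements with PDEContentGetNumElems and PDEContentGetElem.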

    Gian Maria.
    gianmaria_ricci@adobeforums.com Guest

  8. #7

    Re: Export data from pdf

    PDETextGetText is about getting the actual raw "codes" from the content stream - if you need something that equates to textual content, there are other methods.
    Leonard_Rosenthol@adobeforums.com Guest
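
To make the "raw codes" point concrete, here is a minimal sketch that splits a buffer such as 0, 22, 0, 21, 0, 21 into two-byte, big-endian character codes. It assumes, without confirmation from the thread, that the font in question uses two-byte codes; `decode_two_byte_codes` is a hypothetical helper, not an SDK function. The resulting numbers index into the font's own encoding and are not Unicode, which would explain why casting the buffer to wchar_t produced garbage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Split a raw text buffer into big-endian two-byte codes.
 * Returns the number of codes written to `out`, or 0 if the buffer
 * length is odd or `out` is too small to hold every code.
 */
size_t decode_two_byte_codes(const unsigned char *buf, size_t len,
                             uint16_t *out, size_t out_cap)
{
    assert(buf != NULL && out != NULL);
    if (len % 2 != 0 || len / 2 > out_cap)
        return 0;
    for (size_t i = 0; i < len / 2; ++i) {
        /* High byte first: bytes {0, 22} become the code 22. */
        out[i] = (uint16_t)((buf[2 * i] << 8) | buf[2 * i + 1]);
    }
    return len / 2;
}
```

For the bytes quoted in post #5 this yields the three codes 22, 21, 21, one per digit of the label. Turning those codes into readable characters still needs the font's mapping (for example a ToUnicode CMap, if the font has one), which is why a text-oriented API such as the SDK's word finder may be a better fit than reading the content stream codes directly.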

  9. #8

    Re: Export data from pdf

    I understand, but some of the text is still completely missing; I'll keep investigating. Can you tell me which other methods work with text in the PDE layer?

    Thanks in advance for your help.

    Gian Maria
    gianmaria_ricci@adobeforums.com Guest
