HTML to XML in Perl?


  1. #1

    HTML to XML in Perl?

    Suppose I want to translate an HTML to a XML-well-formed HTML (so that
    I can, e.g., apply xsltproc to the result). E.g., HTML::TreeBuilder
    can apply "usual heuristics" to parse HTML; how to get XML out of it?

    Thanks,
    Ilya
    Ilya Zakharevich Guest

  3. #2

    Re: HTML to XML in Perl?

    Ilya Zakharevich <nospam-abuse@ilyaz.org> wrote:
    > Suppose I want to translate an HTML to a XML-well-formed HTML (so that
    > I can, e.g., apply xsltproc to the result). E.g., HTML::TreeBuilder
    > can apply "usual heuristics" to parse HTML; how to get XML out of it?
    Question: I use XML, not XHTML, at home, and use XML::Twig to convert it
    to HTML. I can use xsltproc if I want to on the XML file.

    You might want to traverse the parse tree HTML::TreeBuilder generates.
    Also, not 100% sure, but it might be that HTML Tidy can do the XHTML
    conversion for you:

    Google...

    "Validator fixes errors in HTML and XHTML. Converts HTML to XHTML. Free
    Software."

    http://www.google.com/search?q=html%20tidy%20xhtml

    Sounds like it does :-D.
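
    For the first suggestion above (traversing the parse tree that
    HTML::TreeBuilder generates), here is a minimal sketch. The sample
    markup and the @outline variable are made up for illustration; it
    only lists element names indented by depth, as a starting point for
    writing out XML by hand.

    ```perl
    use strict;
    use warnings;
    use HTML::TreeBuilder;

    # Hypothetical sloppy input: the <li> elements are never closed.
    my $html = '<ul><li>one<li>two</ul>';

    # Parse with the usual HTML heuristics (implicit tags are filled in).
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Visit the root and every descendant element; text nodes are skipped.
    my @outline;
    for my $el ($tree, $tree->descendants) {
        push @outline, ('  ' x $el->depth) . $el->tag;
    }

    # Release the tree once we are done with it.
    $tree->delete;

    print "$_\n" for @outline;
    ```

    A real converter would also have to emit attributes and escape text
    content, which this sketch leaves out.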

    --
    John Bokma, Freelance software developer &
    Experienced Perl programmer: http://castleamber.com/
    John Bokma Guest

  4. #3

    Re: HTML to XML in Perl?

    Ilya Zakharevich wrote:
    > Suppose I want to translate an HTML to a XML-well-formed HTML (so that
    > I can, e.g., apply xsltproc to the result). E.g., HTML::TreeBuilder
    > can apply "usual heuristics" to parse HTML; how to get XML out of it?
    HTML::TreeBuilder has an as_XML method that does a pretty good job at getting proper XML from any HTML. It's not 100% reliable, but it mostly works.
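
    A minimal sketch of the as_XML approach, assuming HTML::TreeBuilder is installed from CPAN. The sample markup is made up for illustration, and the result should still be checked with an XML parser before it is handed to xsltproc.

    ```perl
    use strict;
    use warnings;
    use HTML::TreeBuilder;

    # Hypothetical input: unclosed <p> elements and a bare <br>.
    my $html = '<html><head><title>Demo</title></head>'
             . '<body><p>First<br>Second<p>Third</body></html>';

    # Parse with the usual HTML heuristics.
    my $tree = HTML::TreeBuilder->new;
    $tree->parse($html);
    $tree->eof;

    # Serialize the whole tree as XML: elements get closed, empty ones self-close.
    my $xml = $tree->as_XML;

    # Release the tree once the string has been produced.
    $tree->delete;

    print $xml;
    ```

    The output could then be saved to a file and passed to xsltproc along with a stylesheet.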

    I found HTML::TreeBuilder more useful for converting HTML to XML than XML::LibXML, which fails to parse a lot of crappy HTML, and more useful than HTML::Tidy, which is (was?) a pain to use.
    mirod Junior Member (joined Aug 2010)
