How to use HTML::Parser to remove HTML tags and print result


  1. #1

    How to use HTML::Parser to remove HTML tags and print result

    I am trying to use HTML::Parser to parse an HTML file, remove all HTML tags
    (including comments, etc.), replace all entities (e.g. &amp;), and put the
    result into a variable as a string. I figure HTML::Parser itself can
    somehow perform the filtering, but how do I get the result back as a
    string? I'd appreciate some sample code if anyone has any. Sorry if this is
    a real n00b question.

    Thanks a lot,
    Mitchua



    Mitchua Guest

  3. #2

    Re: How to use HTML::Parser to remove HTML tags and print result


    "Mitchua" <mitchua@yahoo.com> wrote in message
    news:pvmQa.23629$lJd1.21048@news01.bloor.is.net.cable.rogers.com...
    > I am trying to use HTML::Parser to parse an HTML file, remove all HTML
    > tags (including comments, etc.), replace all entities (e.g. &amp;), and
    > put the result into a variable as a string. I figure HTML::Parser itself
    > can somehow perform the filtering, but how do I get it back as a string?
    > I'd appreciate some sample code if anyone has any. Sorry if this is a
    > real n00b question.
    >
    > Thanks a lot,
    > Mitchua
    For a sample of parsing a web page, try this:
    http://www.wdvl.com/Authoring/Languages/Perl/PerlfortheWeb/summarizer.html
    If you are just trying to remove all the HTML tags, you could simply do
    this (the /s modifier lets a tag span more than one line):
    $webpage =~ s/<.*?>//gs;
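
The original question asked for HTML::Parser itself to do the filtering and hand back a string, which the regex above does not cover (it leaves entities undecoded and can be confused by a ">" inside a comment or attribute). A minimal sketch of one way to do it follows, assuming HTML::Parser version 3 or later. The sub name html_to_text and the sample input are made up for illustration. Only a text handler is registered, so tags and comments are dropped, and the 'dtext' argspec asks the parser to pass text with entities already decoded.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: returns the text content of an HTML string,
# with tags and comments removed and entities decoded.
sub html_to_text {
    my ($html) = @_;
    my $text = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # 'dtext' passes each text chunk with entities such as &amp; decoded.
        # No start, end, or comment handlers are set, so those are discarded.
        text_h => [ sub { $text .= shift }, 'dtext' ],
    );

    # Skip the contents of <script> and <style>, not just their tags.
    $parser->ignore_elements(qw(script style));

    # Deliver each run of text in one piece rather than in arbitrary chunks.
    $parser->unbroken_text(1);

    $parser->parse($html);
    $parser->eof;    # flush any buffered text

    return $text;
}

# Sample input, invented for this example.
my $html = '<p>Tom &amp; Jerry<!-- hidden --></p>';
my $result = html_to_text($html);
print "$result\n";
```

To process a file instead, read it into a scalar first and pass that to the sub. The whitespace of the source document is kept as-is, so you may want to collapse runs of blank lines afterwards.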

    Ice Demon


    Ice Demon Guest

