Filtering HTML tags from email


  1. #1

    filtering html tags from email

    Without going through the hassle of setting up proxy servers,
    isn't there a way that one can filter out html tags from a
    message (say, pipe the email through the filter from kmail for
    instance?)

    Perhaps I'm looking too hard for it, but I didn't see anything in
    the ports tree except for /mail/nohtml. I tried to pipe an HTML
    message through nohtml.py from kmail, but it doesn't seem to work
    (although I'm getting no errors from kmail's filter log).

    Any ideas? Thx.


    Mike
    Mike Hauber Guest

  3. #2

    Re: filtering html tags from email

    On 02/22/05 11:16 PM, Mike Hauber sat at the `puter and typed:
    > Without going through the hassle of setting up proxy servers,
    > isn't there a way that one can filter out html tags from a
    > message (say, pipe the email through the filter from kmail for
    > instance?)
    >
    > Perhaps I'm looking too hard for it, but I didn't see anything in
    > the ports tree except for /mail/nohtml. I tried to pipe a html
    > message through nohtml.py from kmail, but doesn't seem to work
    > (although I'm getting no errors from kmail's filter log).
    >
    > Any ideas? Thx.
    Mutt saves to a temp file then calls the following command:
    lynx -localhost -dump %s
    where '%s' is the temporary file you saved it to.

    You could also just pipe it to the following:
    lynx -localhost -dump -stdin

    The -localhost argument prevents lynx from following links to
    hosts other than your own machine, which helps avoid generating
    hits for unscrupulous spammers who get paid for hits on a URL.

    Just make sure lynx is installed.
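
    For anyone scripting this, here is a minimal Python sketch of the
    same pipe. It assumes lynx is on the PATH and uses only the flags
    quoted above; the function name render_html and the cmd parameter
    are made up for illustration. Note that it expects just the HTML
    body, not a whole message with headers.

```python
import subprocess

# Flags exactly as quoted in the thread; that lynx is installed is an assumption.
LYNX_CMD = ("lynx", "-localhost", "-dump", "-stdin")


def render_html(html, cmd=LYNX_CMD):
    """Feed an HTML string to a renderer on stdin and return what it prints."""
    result = subprocess.run(
        cmd,
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the renderer exits non-zero
    )
    return result.stdout.decode("utf-8", errors="replace")
```

    Passing a different cmd (w3m, html2text) should work the same way,
    as long as that tool reads HTML on stdin.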

    Lou
    --
    Louis LeBlanc FreeBSD-at-keyslapper-DOT-net
    Fully Funded Hobbyist, KeySlapper Extrordinaire :)
    Please send off-list email to: leblanc at keyslapper d.t net
    Key fingerprint = C5E7 4762 F071 CE3B ED51 4FB8 AF85 A2FE 80C8 D9A2

    Habit is habit, and not to be flung out of the window by any man, but
    coaxed down-stairs a step at a time.
    -- Mark Twain, "Pudd'nhead Wilson's Calendar"


    Louis LeBlanc Guest

  4. #3

    Re: filtering HTML tags from email

    On Wednesday 23 February 2005 12:50 am, Louis LeBlanc wrote:
    > On 02/22/05 11:16 PM, Mike Hauber sat at the `puter and typed:
    > > Without going through the hassle of setting up proxy servers,
    > > isn't there a way that one can filter out html tags from a
    > > message (say, pipe the email through the filter from kmail
    > > for instance?)
    > >
    > > Perhaps I'm looking too hard for it, but I didn't see
    > > anything in the ports tree except for /mail/nohtml. I tried
    > > to pipe a html message through nohtml.py from kmail, but
    > > doesn't seem to work (although I'm getting no errors from
    > > kmail's filter log).
    > >
    > > Any ideas? Thx.
    >
    > Mutt saves to a temp file then calls the following command:
    > lynx -localhost -dump %s
    > where '%s' is the temporary file you saved it to.
    >
    > You could also just pipe it to the following:
    > lynx -localhost -dump -stdin
    >
    > the -localhost argument prevents lynx from simply following
    > links external to your machine - helpful to avoid generating
    > hits for unscrupulous spammers that get paid for hits on a URL.
    >
    > Just make sure lynx is installed.
    >
    > Lou
    Okay, so to be sure, there is no filter (as of yet) to simply open
    an email file, strip the HTML tags, and resave it? I'm not
    complaining, as this may actually be something I'm capable of
    creating myself. (I'll make this my first python project. :) )

    I'm just making sure I'm not missing anything obvious before I
    start working on it. It's irritating to spend time on something
    only to find out that it's already been done.

    Thanks,

    Mike

    Mike Hauber Guest

  5. #4

    Re: filtering HTML tags from email

    Mike Hauber wrote:
    > > Mutt saves to a temp file then calls the following command:
    > > lynx -localhost -dump %s
    > > where '%s' is the temporary file you saved it to.
    > >
    > > You could also just pipe it to the following:
    > > lynx -localhost -dump -stdin
    > >
    > > the -localhost argument prevents lynx from simply following
    > > links external to your machine - helpful to avoid generating
    > > hits for unscrupulous spammers that get paid for hits on a URL.
    > >
    > > Just make sure lynx is installed.
    > >
    > > Lou
    >
    > Okay, so to be sure, there is no filter (as of yet) to simply open
    > an email file, strip the HTML tags, and resave it? I'm not
    > complaining, as this may actually be something I'm capable of
    > creating myself. (I'll make this my first python project. :) )
    >
    > I'm just making sure I'm not missing anything obvious before I
    > start working on it. It's irritating to spend time on something
    > only to find out that it's already been done.
    You probably could do it also with procmail + lynx (or w3m) during the
    delivery process.

    Another possibility is to have the following entries in your ~/.mailcap
    file, which converts html, doc and rtf to plain text.

    text/html; w3m -dump -T text/html; copiousoutput;
    application/msword; antiword %s; copiousoutput
    application/rtf; rtfreader %s; copiousoutput

    As for your python script: I don't think that just stripping everything
    matching the following expressions is correct because they might appear
    in non html emails, too: <.*> <\/.*> (perl syntax).

    At least, you'd need a list of valid html tags, i.e. a regular grammar
    for html: <b> | </b> | <i> | </i> | ... (BNF notation).
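
    A rough Python sketch of the difference, for illustration only:
    KNOWN_TAGS is a small made-up subset rather than the full HTML tag
    list, and strip_known_tags is a hypothetical helper name. The
    greedy pattern eats everything between the first '<' and the last
    '>', including a plain address in angle brackets, while the
    tag-list pattern leaves the address alone.

```python
import re

# Illustrative subset only; a real filter would need the complete tag list.
KNOWN_TAGS = ("b", "i", "u", "p", "br", "a", "html", "body", "div", "span")

# Greedy pattern from the post: spans from the first '<' to the last '>'.
NAIVE = re.compile(r"<.*>")

# Match only opening/closing tags whose name is in KNOWN_TAGS. The lookahead
# requires whitespace, '/' or '>' right after the name, so <a@example.org>
# is not mistaken for an <a> tag.
TAG = re.compile(
    r"</?(?:%s)(?=[\s/>])[^<>]*>" % "|".join(KNOWN_TAGS),
    re.IGNORECASE,
)


def strip_known_tags(text):
    """Remove recognised HTML tags and leave other <...> text untouched."""
    return TAG.sub("", text)


line = "Mail <joe@example.org> about <b>bold</b> text"
print(NAIVE.sub("", line))     # Mail  text
print(strip_known_tags(line))  # Mail <joe@example.org> about bold text
```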

    While this is not too hard to implement (and possibly a good project to
    learn a new programming language), this would be too much work for
    something that can be achieved more easily with existing tools (that
    is, for me, personally ;-)

    Simon


    Simon Barner Guest

  6. #5

    Re: filtering HTML tags from email

    On Wednesday 23 February 2005 04:43 am, Simon Barner wrote:
    > > > You could also just pipe it to the following:
    > > > lynx -localhost -dump -stdin
    > > >
    > > > Lou
    > >
    > > Okay, so to be sure, there is no filter (as of yet) to simply
    > > open an email file, strip the HTML tags, and resave it? I'm
    > > not complaining, as this may actually be something I'm
    > > capable of creating myself. (I'll make this my first python
    > > project. :) )
    > >
    >
    > You probably could do it also with procmail + lynx (or w3m)
    > during the delivery process.
    >
    > Another possibility is to have the following entries in your
    > ~/.mailcap file, which converts html, doc and rtf to plain
    > text.
    >
    > text/html; w3m -dump -T text/html; copiousoutput;
    > application/msword; antiword %s; copiousoutput
    > application/rtf; rtfreader %s; copiousoutput
    >
    > Simon
    Just after destroying the headers in who-knows-how-many emails
    (backed up... whew!), I finally realized that piping the
    messages through html2text (or lynx or w3m) was probably not such
    a great idea after all. :)

    This is something that really should be implemented as part of
    kmail itself (it would help to remain compatible with both
    maildir/mbox). I'll continue to be frustrated with html2text for
    a while (it's a pretty cool tool), and who knows... Mayhaps I'll
    figure out a reasonable way to set it up so that everything is
    done automatically.
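
    One way to avoid clobbering headers is to parse the message first
    and convert only the text/html parts. What follows is a minimal
    sketch using Python's standard email package, not a finished
    filter: the names html_to_text and strip_html_parts are
    hypothetical, and the tiny HTMLParser subclass is a crude stand-in
    for html2text (it keeps script/style text and does no layout).

```python
import email
from email import policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content and drop the tags (crude html2text stand-in)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def strip_html_parts(raw_bytes):
    """Convert text/html parts to text/plain; other headers are left alone."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            text = html_to_text(part.get_content())
            # set_content() replaces only this part's Content-* headers
            # and payload; From, Subject, etc. are not touched.
            part.set_content(text)
    return msg.as_bytes()
```

    A filter built on this would read one message on stdin and write
    the result of strip_html_parts() to stdout. Whether kmail's pipe
    filter hands over the full message that way is an assumption to
    check before trusting it with real mail.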

    Thanks for the feeds.

    Mike
    Mike Hauber Guest

  7. #6

    RE: filtering HTML tags from email


    > -----Original Message-----
    > From: owner-freebsd-questions@freebsd.org
    > [mailto:owner-freebsd-questions@freebsd.org] On Behalf Of Mike Hauber
    > Sent: Wednesday, February 23, 2005 4:19 AM
    > To: freebsd-questions@freebsd.org
    > Subject: Re: filtering HTML tags from email
    >
    >
    > Just after destroying the headers in who-knows-how-many emails
    > (backed up... whew!), I finally realized that piping the
    > messages through html2text (or lynx or w3m) was probably not such
    > a great idea after all. :)
    >
    > This is something that really should be implemented as part of
    > kmail itself (it would help to remain compatible with both
    > maildir/mbox). I'll continue to be frustrated with html2text for
    > a while (it's a pretty cool tool), and who knows... Mayhaps I'll
    > figure out a reasonable way to set it up so that everything is
    > done automatically.
    Mike, why are you torturing yourself when http://www.mimedefang.org/
    does this? Afraid of Sendmail?

    Ted
    Ted Mittelstaedt Guest
