Troy Piggins #1
[procmail] [sed] [awk] cleaning up mail headers
Not quite sure where to post this ...
I am trying to write a command line for procmail that takes the From:
header field from an email and cleans it up into just the email address.
Reason being that depending on the sender, the From: field could be in
any of the following formats :
First Last <user@domain.com>
"First Last" <user@domain.com>
user@domain.com
first.last@domain.com.au
and there could be others.
I want to use just the email address (eg user@domain.com or possibly
first.last@domain.com.au) to grep my ~/.aliases file to check if I am
getting mail from someone I trust (whitelist).
My .aliases file is in mutt's format :
alias nickname1 First1 Last1 <user1@domain1.com>
alias nickname2 First2 Last2 <user2@domain2.com>
....
At present I use the following rule, with list.white being just a
manually edited list of emails only :
:0:
* ? formail -x "From:" -x "From" -x "Sender:" \
| egrep -i -is -f $PROCMAILDIR/list.white
white
But (I guess) I want something like this :
:0:
* ? formail -x "From:" -x "From" -x "Sender:" | $CLEAN_FROM_SCRIPT \
| egrep -i -is -f ~/.aliases
white
and I envisaged $CLEAN_FROM_SCRIPT being a sed or awk commandline, but
could be bash script. I need help with that part of it.
Any suggestions?
Thanks.
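
For illustration, a minimal sketch of what such a $CLEAN_FROM_SCRIPT might look like as a shell function built on sed. The name clean_from is hypothetical, and the sketch assumes one header value per input line, in one of the formats listed above or in the "address (Name)" form. It is not a full RFC 2822 address parser.

```shell
#!/bin/sh
# clean_from (hypothetical name): reduce a From:-style value to a bare address.
# Reads header values on stdin, one per line, and prints one address per line.
clean_from() {
    # 1st expression: if the line contains <...>, keep only what is inside.
    # 2nd expression: drop leading whitespace (formail -x leaves a leading blank).
    # 3rd expression: drop everything from the first remaining whitespace on,
    #                 which removes a trailing "(First Last)" comment.
    sed -e 's/.*<\(.*\)>.*/\1/' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]].*$//'
}

# Usage (commented out so that sourcing this file prints nothing):
#   printf '%s\n' '"First Last" <user@domain.com>' | clean_from
```

Display names that themselves contain angle brackets would confuse the first expression, so this should be treated as a starting point only.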
--
T R O Y P I G G I N S
e : troy@piggo.com
-
William Park #2
Re: [procmail] [sed] [awk] cleaning up mail headers
In <comp.mail.misc> Troy Piggins <troy@piggo.com> wrote:
> Not quite sure where to post this ...
>
> I am trying to write a command line for procmail that takes the From:
> header field from an email and cleans it up into just the email address.
> Reason being that depending on the sender, the From: field could be in
> any of the following formats :
>
> First Last <user@domain.com>
> "First Last" <user@domain.com>
> user@domain.com
> first.last@domain.com.au
>
> and there could be others.
> I want to use just the email address (eg user@domain.com or possibly
> first.last@domain.com.au) to grep my ~/.aliases file to check if I am
> getting mail from someone I trust (whitelist).
> My .aliases file is in mutt's format :
>
> alias nickname1 First1 Last1 <user1@domain1.com>
> alias nickname2 First2 Last2 <user2@domain2.com>
> ...
>
> At present I use the following rule, with list.white being just a
> manually edited list of emails only :
>
> :0:
> * ? formail -x "From:" -x "From" -x "Sender:" \
> | egrep -i -is -f $PROCMAILDIR/list.white
> white
>
> But (I guess) I want something like this :
>
> :0:
> * ? formail -x "From:" -x "From" -x "Sender:" | $CLEAN_FROM_SCRIPT \
> | egrep -i -is -f ~/.aliases
> white
>
> and I envisaged $CLEAN_FROM_SCRIPT being a sed or awk commandline, but
> could be bash script. I need help with that part of it.
> Any suggestions?
> Thanks.

1. How about
[a-z0-9_.-]+@[a-z0-9_.-]+
as email address pattern?
2. You should search ~/.aliases for a particular email address, not the
other way around. That is, do
egrep 'user@domain.com' ~/.aliases
and not
echo 'user@domain.com' | egrep -f ~/.aliases
3. As last resort, you can parse ~/.aliases runtime into
ALIASES = '(user1@domain1.com|user2@domain2.com|...)'
to be used as condition in a form of
* ^From:.*(...|...|...)
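
Points 1 and 2 above can be sketched roughly as follows. The function names extract_addrs and in_aliases are made up for illustration, and the -o option of grep is a GNU/BSD extension rather than POSIX, so this assumes such a grep is installed.

```shell
#!/bin/sh
# Address pattern from point 1, used case-insensitively below.
ADDR_RE='[a-z0-9_.-]+@[a-z0-9_.-]+'

# extract_addrs (hypothetical): print every address-like string on stdin,
# one per line. Relies on grep -o (GNU/BSD extension).
extract_addrs() {
    grep -oiE "$ADDR_RE"
}

# in_aliases ADDRESS FILE (hypothetical): succeed if ADDRESS occurs in FILE.
# The address is the fixed-string pattern (-F) and the alias file is the
# data being searched, which is the direction recommended in point 2.
in_aliases() {
    grep -qiF -- "$1" "$2"
}
```

Because -F does a plain substring search, user@domain.com would also match a line containing xuser@domain.com. Searching for the address wrapped in angle brackets would be stricter, assuming every alias line uses the <address> form.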
--
William Park, Open Geometry Consulting, <opengeometry@yahoo.ca>
No, I will not fix your computer! I'll reformat your harddisk, though.
-
Allodoxaphobia #3
Re: [procmail] [sed] [awk] cleaning up mail headers
On Sun, 06 Jun 2004 21:26:21 +1000, Troy Piggins hath writ:
> Not quite sure where to post this ...
>
> I am trying to write a command line for procmail that takes the From:
> header field from an email and cleans it up into just the email address.
> Reason being that depending on the sender, the From: field could be in
> any of the following formats :
>
> First Last <user@domain.com>
> "First Last" <user@domain.com>
> user@domain.com
> first.last@domain.com.au
>
> and there could be others.
> I want to use just the email address (eg user@domain.com or possibly
> first.last@domain.com.au) to grep my ~/.aliases file to check if I am
> getting mail from someone I trust (whitelist).

I've just now been hacking in this area...

Try:
formail -rtzxTo:
as found in:
http://www.xray.mpe.mpg.de/mailing-lists/procmail/1998-05/msg00315.html
HTH,
Jonesy
--
| Marvin L Jones | jonz | W3DHJ | OS/2
| Gunnison, Colorado | @ | Jonesy | linux __
| 7,703' -- 2,345m | frontier.net | DM68mn SK
-
Alan Connor #4
Re: [procmail] [sed] [awk] cleaning up mail headers
On Sun, 06 Jun 2004 21:26:21 +1000, Troy Piggins <troy@piggo.com> wrote:
>
> Not quite sure where to post this ...
>
> I am trying to write a command line for procmail that takes the From:
> header field from an email and cleans it up into just the email address.
> Reason being that depending on the sender, the From: field could be in
> any of the following formats :
>
> First Last <user@domain.com>
> "First Last" <user@domain.com>
> user@domain.com
> first.last@domain.com.au
>
> and there could be others.
> I want to use just the email address (eg user@domain.com or possibly
> first.last@domain.com.au) to grep my ~/.aliases file to check if I am
> getting mail from someone I trust (whitelist).
> My .aliases file is in mutt's format :
>
> alias nickname1 First1 Last1 <user1@domain1.com>
> alias nickname2 First2 Last2 <user2@domain2.com>
> ...
>
> At present I use the following rule, with list.white being just a
> manually edited list of emails only :
>
> :0:
> * ? formail -x "From:" -x "From" -x "Sender:" \
> | egrep -i -is -f $PROCMAILDIR/list.white
> white
>
> But (I guess) I want something like this :
>
> :0:
> * ? formail -x "From:" -x "From" -x "Sender:" | $CLEAN_FROM_SCRIPT \
> | egrep -i -is -f ~/.aliases
> white
>
> and I envisaged $CLEAN_FROM_SCRIPT being a sed or awk commandline, but
> could be bash script. I need help with that part of it.
> Any suggestions?
> Thanks.

Hi Troy,

The nutters are out in force today, aren't they :-)
(I use passlist/blocklist instead of whitelist/blacklist, because the
latter sound racist to me.)
Couple of thoughts:
An effective passlist contains a lot more than just one's trusted
friends. It should include *anyone* you've sent a mail to, and
you aren't going to create an alias for most of those.
They can also be more complex than simply a return address. Sometimes,
you will not know what address the mail will be returned from. Mail
to any large organization can be like that, where it will be handed
from department to department and may end up being returned from the
home of an employee there...
In that case, it is useful to passlist the Subject: line, and have
something like this at the top of the mail:
|The Subject of this mail is a password and needs to be included in
|any reply, unchanged except for Re: (one or more) and whitespace(s).
|Thank you.
For mailing lists, it is often the Return-Path: you passlist, or a
special header from the listserver.
That being said....
(I'm assuming here that you call fetchmail which then calls
procmail...)
It's a lot easier to parse your_alias_file than the incoming headers,
so do that before fetchmail does its thing, with a script:
#!/bin/bash
# /usr/local/bin/alparse
sed -e 's/^\(.*<\)\(.*\)\(>.*\)/\2|\\/' \
-e '$s/|\\//' -e 's/\./\\./g' \
/home/you/your_alias_file > /home/you/newalias
To make it all automatic, alias (shell alias) fetchmail like so:
alias fetchmail='alparse && fetchmail'
So whenever you call fetchmail, the first thing that happens is
that your_alias_file is converted to something procmail can deal
with, and you can add and subtract from it without worrying about
having to do anything else. The old newalias will be overwritten
each time you retrieve your mail.
This goes at the top of your .procmailrc:
ALIAS=`cat /home/you/newalias`
then
:0:
* $ ^(From|From:|Sender:|Reply-To:|Return-Path:).*${ALIAS}
pass
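
As a variation on the alparse idea, here is a rough sketch that turns a mutt alias file into a single-line alternation, so that no trailing backslashes are needed in the output. The function name aliases_to_regex is hypothetical, and the sketch assumes every address of interest appears inside angle brackets, as in the sample alias lines earlier in the thread.

```shell
#!/bin/sh
# aliases_to_regex FILE (hypothetical): print the addresses found in FILE as
# one regex alternation, e.g. user1@domain1\.com|user2@domain2\.com
aliases_to_regex() {
    # Step 1: keep only lines containing <local@domain>, print the address.
    # Step 2: backslash-escape "." and "+", which are regex metacharacters.
    # Step 3: join all lines with "|" into a single line.
    sed -n 's/.*<\([^<>]*@[^<>]*\)>.*/\1/p' "$1" \
        | sed 's/[.+]/\\&/g' \
        | paste -s -d '|' -
}
```

In the recipe the result would probably want to be wrapped in parentheses, along the lines of the * ^From:.*(...|...|...) form suggested earlier in the thread, so that the alternation does not escape the anchored header match.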
AC
--
Pass-List -----> Block-List ----> Challenge-Response
The key to taking control of your mailbox. Design Parameters:
http://tinyurl.com/2t5kp || http://tinyurl.com/3c3ag
Challenge-Response links -- http://tinyurl.com/yrfjb
-
Troy Piggins #5
Re: [procmail] [sed] [awk] cleaning up mail headers
Allodoxaphobia wrote:
> I've just now been hacking in this area...
>
> Try:
> formail -rtzxTo:
>
> as found in:
>
> http://www.xray.mpe.mpg.de/mailing-lists/procmail/1998-05/msg00315.html
>
> HTH,
> Jonesy

Thanks, Jonesy. That formail line did not do what I wanted, but I
followed that link you gave. The archived post from Nancy McGough *did*
contain a sed script along the lines of what I want. She was using it
to create a whitelist from her addressbook :
cat $HOME/Msgs/AddressBook* \
|fgrep "@" \
|sed -e "s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" \
-e "s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/" \
|sort -fu \
> $HOME/Procmail/whitelist.tmp
I thought I could add it into where I had mentioned $CLEAN_FROM_SCRIPT
like so :
:0:
* ? formail -x "From:" -x "From" -x "Sender:" \
|sed -e "s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" \
-e "s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/" \
| egrep -i -is -f $HOME/.aliases
white.test
Unfortunately this is not working (the mail falls through to one of my
other recipes). The log gives this error :
procmail: Executing " formail -x "From:" -x "From" -x "Sender:" | sed -e
"s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" -e
"s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/" | egrep -i -is -f
$HOME/.aliases"
grep: Trailing backslash
procmail: Non-zero exitcode (2) from " formail -x "From:" -x "From" -x
"Sender:" | sed -e "s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" -e
"s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/" | egrep -i -is -f
$HOME/.aliases"
procmail: No match on " formail -x "From:" -x "From" -x "Sender:" | sed
-e "s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" -e
"s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/" | egrep -i -is -f
$HOME/.aliases"
When I run this on the commandline I get the clean address I want :
[troy@linus:~]$ echo "\"Troy Piggins\" <troy@piggo.com>" |sed -e
"s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" -e
"s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/"
the output is :
troy@piggo.com
which is exactly the result I want.
I reckon the grep trailing backslash is playing up, but all the sed
scripts are confusing me with all the //\\/\/\/ etc.
I am missing something, but can't see it.
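
One possible reading of the "grep: Trailing backslash" message, offered as a guess rather than a diagnosis: egrep -f treats every line of the named file as a regular expression, and GNU grep prints that error when a pattern ends in a lone backslash. Mutt config files allow a backslash at the end of a line as a continuation, so an alias split over two lines would yield such a pattern. The helper below (the name is made up) lists any lines of that kind so the theory can be checked against the real ~/.aliases.

```shell
#!/bin/sh
# trailing_backslash_lines FILE (hypothetical): print "lineno:text" for each
# line of FILE that ends in a backslash. Any such line is an invalid regex
# when FILE is handed to grep/egrep with -f.
trailing_backslash_lines() {
    # '\\$' is a basic regex: a literal backslash anchored at end of line.
    grep -n '\\$' -- "$1"
}

# Usage (commented out so that sourcing this file prints nothing):
#   trailing_backslash_lines "$HOME/.aliases"
```

Even with no such lines present, whole alias lines used as patterns would not match a bare address, which is the point made earlier in the thread about searching the alias file for the address instead.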
--
T R O Y P I G G I N S
e : troy@piggo.com
-
Troy Piggins #6
Re: [procmail] [sed] [awk] cleaning up mail headers
Alan Connor wrote:
> Couple of thoughts:
>
> An effective passlist contains a lot more than just one's trusted
> friends. It should include *anyone* you've sent a mail to, and
> you aren't going to create an alias for most of those.

True - I am just using this as a starting point, then once I get that
working I can build from there.

> (I'm assuming here that you call fetchmail which then calls
> procmail...)

yep.

> It's a lot easier to parse your_alias_file than the incoming headers,
> so do that before fetchmail does its thing, with a script:
>
> #!/bin/bash
> # /usr/local/bin/alparse
>
> sed -e 's/^\(.*<\)\(.*\)\(>.*\)/\2|\\/' \
> -e '$s/|\\//' -e 's/\./\\./g' \
> /home/you/your_alias_file > /home/you/newalias
>
> To make it all automatic, alias (shell alias) fetchmail like so:
>
> alias fetchmail='alparse && fetchmail'
>
> So whenever you call fetchmail, the first thing that happens is
> that your_alias_file is converted to something procmail can deal
> with, and you can add and subtract from it without worrying about
> having to do anything else. The old newalias will be overwritten
> each time you retrieve your mail.

Fair enough. I was trying to avoid creating a new temp file every time
I check mail. Would have thought this is unnecessary load with reading,
parsing, writing files every time fetchmail is called.
That is why I was trying to use sed inline in the recipe.
See my response to Allodox's post for my attempt at solution.

> This goes at the top of your .procmailrc:
>
> ALIAS=`cat /home/you/newalias`
>
> then
>
> :0:
> * $ ^(From|From:|Sender:|Reply-To:|Return-Path:).*${ALIAS}
> pass
>
> AC

--
T R O Y P I G G I N S
e : troy@piggo.com
-
Alan Connor #7
Re: [procmail] [sed] [awk] cleaning up mail headers
On Mon, 07 Jun 2004 06:05:58 GMT, Troy Piggins <troy@piggo.com> wrote:
>
>
> Allodoxaphobia wrote:
>> I've just now been hacking in this area...
>>
>> Try:
>> formail -rtzxTo:
>>
>> as found in:
>>
>> http://www.xray.mpe.mpg.de/mailing-lists/procmail/1998-05/msg00315.html
>>
>> HTH,
>> Jonesy
> Thanks, Jonesy. That formail line did not do what I wanted, but I
> followed that link you gave. The archived post from Nancy McGough *did*
> contain a sed script along the lines of what I want. She was using it
> to create a whitelist from her addressbook :
>
> cat $HOME/Msgs/AddressBook* \
> |fgrep "@" \
> |sed -e "s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" \
> -e "s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/" \
> |sort -fu \
> > $HOME/Procmail/whitelist.tmp
> I thought I could add it into where I had mentioned $CLEAN_FROM_SCRIPT
> like so :
>
> :0:
> * ? formail -x "From:" -x "From" -x "Sender:" \
> |sed -e "s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" \
> -e "s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/" \
> | egrep -i -is -f $HOME/.aliases
> white.test
>
> Unfortunately this is not working (the mail falls through to one of my
> other recipes). The log gives this error :
>
> procmail: Executing " formail -x "From:" -x "From" -x "Sender:" | sed -e
> "s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" -e
> "s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/" | egrep -i -is -f
> $HOME/.aliases"
> grep: Trailing backslash
> procmail: Non-zero exitcode (2) from " formail -x "From:" -x "From" -x
> "Sender:" | sed -e "s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" -e
> "s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/" | egrep -i -is -f
> $HOME/.aliases"
> procmail: No match on " formail -x "From:" -x "From" -x "Sender:" | sed
> -e "s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" -e
> "s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/" | egrep -i -is -f
> $HOME/.aliases"
>
> When I run this on the commandline I get the clean address I want :
>
> [troy@linus:~]$ echo "\"Troy Piggins\" <troy@piggo.com>" |sed -e
> "s/^.*[^A-Za-z0-9_.+-]\([A-Za-z0-9_.+-]*@\)/\1/" -e
> "s/\(@[A-Za-z0-9_.+-]*\)[^A-Za-z0-9_.+-].*$/\1/"
>
> the output is :
>
> troy@piggo.com
>
> which is exactly the result I want.
> I reckon the grep trailing backslash is playing up, but all the sed
> scripts are confusing me with all the //\\/\/\/ etc.
> I am missing something, but can't see it.

I have run into the same sort of problem, many times, which is why I
use the solution I presented in my post. Procmail can be very finicky
and has trouble with complex scripts in the rc file itself.
It's usually better to put the script elsewhere and call it or pipe
through it from the procmailrc.
A log entry that reads: "program failure of script3" is a lot easier
to interpret than that mess above, and you know exactly where the
problem is.
On another level, you have to allow for a lot of variation in the
contents of those headers, sedscripting for all of them, and it is
much simpler to let procmail egrep them for a string.
Keep us posted. :-)
AC
--
Pass-List -----> Block-List ----> Challenge-Response
The key to taking control of your mailbox. Design Parameters:
http://tinyurl.com/2t5kp || http://tinyurl.com/3c3ag
Challenge-Response links -- http://tinyurl.com/yrfjb