Search replace using 2 lines for pattern - PERL Beginners

  1. #1

    Search replace using 2 lines for pattern

    Hi,

    I have a big mail archive in which the From... top header has disappeared
    and has been replaced by an empty newline. I know it is above the first
    Received: header so I can locate it.

    The mail headers start like this:

    Received: from smtpout.mac.com ([204.179.120.85]) by lists.apple.com
    (8.11.6/8.11.6) with ESMTP id fB6BBGb27756 for
    <cocoa-devlists.apple.com>; Thu, 6 Dec 2001 03:11:16 -0800 (PST)
    Received: from smtp-relay02.mac.com (server-source-si02 [10.13.10.6]) by
    smtpout.mac.com (8.12.1/8.10.2/1.0) with ESMTP id fB6B59Jl026926 for
    <cocoa-devlists.apple.com>; Thu, 6 Dec 2001 03:05:09 -0800 (PST)

    etc...

    Usually, it should start like the following (notice the From line):

    From mailinglistlists.apple.com Thu Dec 6 03:11:16 2001
    Received: from smtpout.mac.com ([204.179.120.85]) by lists.apple.com
    (8.11.6/8.11.6) with ESMTP id fB6BBGb27756 for
    <mailinglistlists.apple.com>; Thu, 6 Dec 2001 03:11:16 -0800 (PST)
    Received: from smtp-relay02.mac.com (server-source-si02 [10.13.10.6]) by
    smtpout.mac.com (8.12.1/8.10.2/1.0) with ESMTP id fB6B59Jl026926 for
    <mailinglistlists.apple.com>; Thu, 6 Dec 2001 03:05:09 -0800 (PST)

    etc...

    Perl seems the way to go to add the From line. Unfortunately, I have tried
    many times with no success. I can easily change the "Received:" header to be
    "From mailinglistlists.apple.com\nReceived:" but as there is more than one
    "Received:" header, they all get replaced, which is bad.

    I think I need a regex that would match "\n\nReceived:" and replace it with
    "\nFrom \nReceived:". It doesn't seem difficult but I am stuck.

    I hope someone can help me, I have tried to solve this for hours...

    TIA,

    Bertrand Mansion
    Mamasam

    Bertrand Mansion Guest

  2. #2

    RE: Search replace using 2 lines for pattern

    Bertrand Mansion <bmansionmamasam.com> wrote:
    :
    [snip]
    : I have tried many times with no success. I can easily change
    : the "Received:" header to be
    : "From mailinglistlists.apple.com\nReceived:" but as there
    : is more than one "Received:" header, they all get replaced,
    : which is bad.
    :
    : I think I need a regex that would match "\n\nReceived:" and
    : replace it with "\nFrom \nReceived:". It doesn't seem
    : difficult but I am stuck.
    :
    : I hope someone can help me, I have tried to solve this for hours...


    Can you show us what you have? It would make solving this
    much easier.


    HTH,

    Charles K. Clarkson
    --
    Head Bottle Washer,
    Clarkson Energy Homes, Inc.
    Mobile Home Specialists
    254 968-8328

    Charles K. Clarkson Guest

  3. #3

    Re: Search replace using 2 lines for pattern

    <cclarksonhtcomp.net> wrote:
    > Bertrand Mansion <bmansionmamasam.com> wrote:
    > :
    > [snip]
    > : I have tried many times with no success. I can easily change
    > : the "Received:" header to be
    > : "From mailinglistlists.apple.com\nReceived:" but as there
    > : is more than one "Received:" header, they all get replaced,
    > : which is bad.
    > :
    > : I think I need a regex that would match "\n\nReceived:" and
    > : replace it with "\nFrom \nReceived:". It doesn't seem
    > : difficult but I am stuck.
    > :
    > : I hope someone can help me, I have tried to solve this for hours...
    >
    >
    > Can you show us what you have? It would make solving this
    > much easier.
    Well, I don't have much, I am trying to do it from the command line:

    perl -pi -e "s/\n\nReceived:/\nFrom x\nReceived:/" file.txt

    The archive is approximately 70 MB. I am testing on a smaller subset.
    This looks so simple and common that I am probably approaching the
    problem the wrong way.

    Bertrand Mansion
    Mamasam

    Bertrand Mansion Guest

  4. #4

    RE: Search replace using 2 lines for pattern



    : -----Original Message-----
    : From: Bertrand Mansion [mailto:bmansionmamasam.com]
    : Sent: Sunday, January 18, 2004 7:56 AM
    : To: Charles K. Clarkson; beginnersperl.org
    : Subject: Re: Search replace using 2 lines for pattern
    :
    :
    : <cclarksonhtcomp.net> wrote:
    :
    : > Bertrand Mansion <bmansionmamasam.com> wrote:
    : > :
    : > [snip]
    : > : I have tried many times with no success. I can easily change
    : > : the "Received:" header to be
    : > : "From mailinglistlists.apple.com\nReceived:" but as there
    : > : is more than one "Received:" header, they all get replaced,
    : > : which is bad.
    : > :
    : > : I think I need a regex that would match "\n\nReceived:" and
    : > : replace it with "\nFrom \nReceived:". It doesn't seem
    : > : difficult but I am stuck.
    : > :
    : > : I hope someone can help me, I have tried to solve this
    : for hours...
    : >
    : >
    : > Can you show us what you have? It would make solving this
    : > much easier.
    :
    : Well, I don't have much, I am trying to do it from the command line:
    :
    : perl -pi -e "s/\n\nReceived:/\nFrom x\nReceived:/" file.txt
    :
    : The archive is approximately 70 MB. I am testing on a smaller subset.
    : This looks so simple and common that I am probably approaching
    : the problem the wrong way.

    I have never understood why one-liners are so popular.
    Perhaps because I abandoned the command line for the mouse
    so many years ago.

    Where is the x coming from? I think you should solve
    that first, but your current problem is with '-p' which is
    similar to (ignoring -i for now):

    while (<>) {
        s/\n\nReceived:/\nFrom x\nReceived:/;
    } continue {
        print or die "-p destination: $!\n";
    }

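    A consequence of that loop worth spelling out: with the default input
    record separator, each pass sees a single line, so $_ never holds two
    newlines in a row and the pattern cannot match. The sketch below
    illustrates this on made-up data; the sample headers and the variable
    names are assumptions for illustration, not taken from the real archive.

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample: two messages, each starting with the blank line
    # that took the place of the "From " line.
    my $sample = join '',
        "\n",
        "Received: from a.example by b.example\n",
        "Received: from c.example by a.example\n",
        "Subject: first\n",
        "\n",
        "body one\n",
        "\n",
        "Received: from d.example by b.example\n",
        "Subject: second\n";

    # Line-at-a-time reading, as under -p: each $_ ends at its first newline.
    open my $in, '<', \$sample or die "cannot open in-memory file: $!";
    my $line_hits = 0;
    while (<$in>) {
        $line_hits++ if /\n\nReceived:/;
    }
    close $in;

    # The same pattern applied to the whole text at once.
    my $whole_hits = () = $sample =~ /\n\nReceived:/g;

    print "line-at-a-time matches: $line_hits\n";
    print "whole-text matches:     $whole_hits\n";
    ```

    The line-at-a-time count stays at zero. The whole-text count is one
    rather than two, because the first message sits at the very start of
    the text and has no newline in front of its blank line, a case the
    two-newline pattern also misses.
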
    If you only need one replacement, you need to adjust this
    to stop after the first success.
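
    Since each message in the archive needs its own From line, one possible
    approach is to keep reading line by line but hold back every blank line
    until the next line is known. This is only a sketch under assumptions:
    restore_from_lines is a hypothetical helper, "From x" is the placeholder
    from the one-liner (a real address and date would have to come from
    somewhere else), and a blank line inside a message body that happens to
    be followed by a line starting with "Received:" would be rewritten too.

    ```perl
    use strict;
    use warnings;

    # Assumption: placeholder envelope line, as in the original one-liner.
    my $FROM_LINE = "From x\n";

    # Hypothetical helper: copy $in to $out, turning a blank line that is
    # immediately followed by "Received:" into the From line.
    sub restore_from_lines {
        my ($in, $out) = @_;
        my $held_blank = 0;    # true while a blank line is being held back
        while (my $line = <$in>) {
            if ($held_blank) {
                # Now that the next line is known, emit the held line.
                print {$out} ($line =~ /^Received:/ ? $FROM_LINE : "\n");
                $held_blank = 0;
            }
            if ($line eq "\n") {
                $held_blank = 1;    # decide once the next line arrives
            }
            else {
                print {$out} $line;
            }
        }
        print {$out} "\n" if $held_blank;    # blank line at end of input
    }

    # Small in-memory demonstration with made-up headers.
    my $sample = join '',
        "\n",
        "Received: from a.example by b.example\n",
        "Received: from c.example by a.example\n",
        "\n",
        "body\n";
    my $result = '';
    open my $in,  '<', \$sample or die "cannot read sample: $!";
    open my $out, '>', \$result or die "cannot write result: $!";
    restore_from_lines($in, $out);
    close $out;
    print $result;
    ```

    For the real archive the two handles would be opened on the input file
    and on a new output file instead of in-memory strings, which keeps the
    original untouched and avoids loading the whole 70 MB at once.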


    HTH,

    Charles K. Clarkson
    --
    Head Bottle Washer,
    Clarkson Energy Homes, Inc.
    Mobile Home Specialists
    254 968-8328

    Charles K. Clarkson Guest
