Need help comparing lines in two files - PERL Beginners

  1. #1

    Need help comparing lines in two files

    This very green newbie would like to compare two files, let's say File1
    and File2. I want to put the difference from File2 only into a new file,
    File3.

    For example:

    File1.txt
    oranges
    apples
    bananas

    File2.txt
    apples
    kiwi
    bananas

    The result I want for File3 is the new entry in File2, which is kiwi. (I
    don't care that oranges was in File1 and not File2.)

    I tried using a nested foreach loop structure, but I can't get that to
    work and I have a feeling using nested foreach's is not the way to go.

    I'm guessing somehow I should use hashes, but I've never used a hash for
    anything and I don't really know how to use one. Can someone help?

    Here's my feeble attempt:

    my $file1;
    my $file2;

    my @file1 = qw(oranges apples bananas);
    my @file2 = qw(apples kiwi bananas);

    foreach $file2 (@file2){
        foreach $file2 (@file2){
            #print "$mastervob $tempvob \n";
            if ($file2 eq $file1) {
                last; # I would like to go up to the toplevel "foreach"
                      # here, but I don't know how to do it
            }         # and I'm not sure this would even work.
            else{
                print "$file2 \n";
            }
        }
    }
    Stuart Clemons Guest

  2. #2

    Re: Need help comparing lines in two files

    Let's say file 1 is:

    foo
    bar
    .... continues on for 100 lines

    And file 2 is:

    foo
    baz
    bar
    .... continues on exactly the same 100 lines as file 1

    Would file 2 be different from file 1 from line 2 and down? Or would it
    be different for line 2 and 3?

    Also, the keywords:

    next; # moves on to the next iteration of a loop
    last; # leaves the loop

    should help you control a while loop (or a bare block), i.e.

    {
        # This is a loop that runs once: just a pair of braces.
        # Put a last statement in it and execution will leave the block.
        # You can put one of these inside your for loops, or around them.
    }

    Also you can get tricky by naming loops, i.e.:

    FOO:
    {
        print "foo";
        BAR:
        {
            last FOO;
        }
        # anything below here never executes
        print "bar";
    }
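
    For nested foreach loops, a labeled next is the usual way to jump from
    the inner loop straight to the next pass of the outer one. Here is a
    minimal sketch of that idea; the label OUTER, the sub name not_in and the
    sample lists are made up for illustration.

```perl
use strict;
use warnings;

# Return the items of @$candidates that never appear in @$known.
# (Hypothetical helper; the names are for illustration only.)
sub not_in {
    my ($known, $candidates) = @_;
    my @missing;

    OUTER: foreach my $candidate (@$candidates) {
        foreach my $item (@$known) {
            # A match means this candidate is not new: skip the push
            # below and move on to the next candidate.
            next OUTER if $candidate eq $item;
        }
        # Reached only when the inner loop found no match.
        push @missing, $candidate;
    }
    return @missing;
}

my @result = not_in([qw(foo bar)], [qw(foo baz bar)]);
print "@result\n";    # prints "baz"
```

    Using last instead of next OUTER would only leave the inner loop, so the
    push would still run for every candidate.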

    Dan Anderson Guest

  3. #3

    Re: Need help comparing lines in two files

    One more thing: those loops I was telling you about, just using a pair
    of brackets, also have their own scope. It's a good way to clean up
    after yourself, i.e.

    my $foo = 40;
    {
        my $foo = 50;
        print $foo; # prints 50
        # the inner $foo goes out of scope at the closing brace
    }
    print $foo; # prints 40

    Also:

    use strict;
    use warnings;

    Should ALWAYS be at the top of your scripts until you know enough Perl
    to know when to bend or break this rule.

    -Dan

    Dan Anderson Guest

  4. #4

    Re: Need help comparing lines in two files

    > This very green newbie would like to compare two files, let's say File1
    > and File2. I want to put the difference from File2 only into a new file,
    > File3.
    I had a very similar problem about a week ago, which James answered here:

    http://groups.google.com/groups?q=Perl+looping+(a+lot+of)+files&hl=en&lr=&ie=UTF-8&selm=28A16704-4AD3-11D8-9A03-000A95BA45F8%40grayproductions.net&rnum=1

    or try searching Google Groups for "perl looping through (a lot of) files".

    The only real difference is that I didn't want to compare one FILE2 to
    FILE1, but 500.

    However, be careful about your file size:
    I settled on reading one file into memory (as an array) and looping
    through the other ones using a while (<FILE2>), reading the 500 files
    line by line.
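
    In case it helps, here is a rough sketch of that approach applied to the
    File1/File2/File3 example: File1 is read into an array, File2 is read
    line by line with while, and lines not found in File1 go to File3. The
    sub name write_new_lines is hypothetical, and the sample files are
    created in a temporary directory only so the sketch runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: copy to $out_path every line of $new_path
# that does not occur anywhere in $old_path.
sub write_new_lines {
    my ($old_path, $new_path, $out_path) = @_;

    # Read the first file into memory as an array of lines.
    open my $old_fh, '<', $old_path or die "Cannot open $old_path: $!";
    chomp(my @old_lines = <$old_fh>);
    close $old_fh;

    open my $new_fh, '<', $new_path or die "Cannot open $new_path: $!";
    open my $out_fh, '>', $out_path or die "Cannot open $out_path: $!";

    # Walk the second file one line at a time.
    while (my $line = <$new_fh>) {
        chomp $line;
        # grep in scalar context counts the matching lines.
        my $found = grep { $_ eq $line } @old_lines;
        print {$out_fh} "$line\n" unless $found;
    }

    close $new_fh;
    close $out_fh or die "Cannot close $out_path: $!";
    return;
}

# Build the sample files from the question in a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my %samples = (
    'File1.txt' => [qw(oranges apples bananas)],
    'File2.txt' => [qw(apples kiwi bananas)],
);
for my $name (keys %samples) {
    open my $fh, '>', "$dir/$name" or die "Cannot write $name: $!";
    print {$fh} "$_\n" for @{ $samples{$name} };
    close $fh;
}

write_new_lines("$dir/File1.txt", "$dir/File2.txt", "$dir/File3.txt");
```

    With the sample data, File3.txt should end up holding the single line
    kiwi. For files of 30 or so lines the grep over the array is cheap; for
    much larger inputs a hash lookup would scale better.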
    > For example:
    >
    > File1.txt
    > oranges
    > apples
    > bananas
    >
    > File2.txt
    > apples
    > kiwi
    > bananas
    >
    > The result I want for File3 is the new entry in File2, which is kiwi. (I
    > don't care that oranges was in File1 and not File2.)
    >
    > I tried using a nested foreach loop structure, but I can't get that to
    > work and I have a feeling using nested foreach's is not the way to go.
    Why not?
    > I'm guessing somehow I should use hashes, but I've never used a hash for
    > anything and I don't really know how to use a hash. Can someone help ?
    Do you need to associate the contents of the line with a filename or
    something? If not, use an array.
    > Here's my feeble attempt:
    >
    > my $file1;
    > my $file2;
    >
    > my @file1 = qw(oranges apples bananas);
    > my @file2 = qw(apples kiwi bananas);
    >
    As Dan showed:

    FILE2:
    > foreach $file2 (@file2){
    > foreach $file2 (@file2){
    you may want
    foreach my $file1 (@file1){
    here
    > #print "$mastervob $tempvob \n";
    > if ($file2 eq $file1) {
    > last; # I would like to go up to the
    > toplevel "foreach" here, but I don't know how to do it
    > } # and I'm not sure this would even
    > work.
    as Dan said:

    next FILE2;

    will do the job.

    > else{
    > print "$file2 \n";
    > }
    > }
    > }
    This doesn't do what I assume you want when you place the print in the
    inner loop.
    Just look at the link above.

    Hope that's a start, Wolf

    Wolf Blaum Guest

  5. #5

    Re: Need help comparing lines in two files


    On Jan 22, 2004, at 4:52 PM, stuart_clemonsus.ibm.com wrote:
    > This very green newbie would like to compare two files, let's say File1
    > and File2. I
    > want to put the difference from File2 only, into a new file, File3.
    >
    > For example:
    >
    > File1.txt
    > oranges
    > apples
    > bananas
    >
    > File2.txt
    > apples
    > kiwi
    > bananas
    >
    > The result I want for File3 is the new entry in File2, which is kiwi.
    > (I don't care that oranges was in File1 and not File2.)
    In theory, then, File2.txt could have been:

    oranges
    apples
    kiwi
    banana

    What about:

    apples
    kiwi
    wombat
    bananas

    Then you would want to have kiwi and wombat.

    One strategy would be, say:
    my @file1 = qw(oranges apples bananas);
    my @file2 = qw(apples kiwi bananas frodo bagins);

    my @list = get_diff_list(\@file1, \@file2);

    print "we see $_\n" foreach (@list);

    #------------------------
    #
    sub get_diff_list
    {
        my ($list1, $list2) = @_;

        # one hash key per entry of the first list
        my %hash = map { $_ => 1 } @$list1;

        my @ret_list;
        foreach (@$list2)
        {
            if ( $hash{$_} )
            {
                delete( $hash{$_} );
            } else {
                push(@ret_list, $_);
            }
        }
        # if you wanted to have the remaining bits that were
        # in list1 and not in list2
        #push(@ret_list, $_) foreach (keys(%hash));
        @ret_list;

    } # end of get_diff_list

    or how about

    sub get_diff_list
    {
        my ($list1, $list2) = @_;

        my %hash = map { $_ => 1 } @$list1;
        # keep only the entries of list2 that are not keys of %hash
        grep { !exists $hash{$_} } @$list2;
    }
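
    For anyone who has not used a hash before, the idea in both versions is
    to treat the hash as a lookup table: every entry of the first list
    becomes a key, and exists then answers "have I seen this?" without an
    inner loop. Below is a small stand-alone sketch; %seen, new_entries and
    the sample data are illustrative names, and unlike the versions above it
    also skips repeats within the second list.

```perl
use strict;
use warnings;

# Hypothetical helper: entries of @$new that are absent from @$old,
# in their original order and reported once each.
sub new_entries {
    my ($old, $new) = @_;

    # One key per old entry; the value 1 is just a placeholder.
    my %seen;
    $seen{$_} = 1 for @$old;

    my @fresh;
    for my $entry (@$new) {
        next if exists $seen{$entry};   # already known or already reported
        $seen{$entry} = 1;              # remember it so repeats are skipped
        push @fresh, $entry;
    }
    return @fresh;
}

my @old = qw(oranges apples bananas);
my @new = qw(apples kiwi bananas kiwi wombat);
print "$_\n" for new_entries(\@old, \@new);   # prints kiwi, then wombat
```

    Whether repeats should be reported once or every time depends on what
    you do with the result, so treat that part as a design choice rather
    than a requirement.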


    ciao
    drieux

    ---

    Drieux Guest

  6. #6

    Re: Need help comparing lines in two files

    Thank you Dan and Wolf ! With the suggested changes, my foreach loop
    script now works as I hoped it would. (My first script did have a typo,
    as you pointed out, though my logic was still wrong.) I'm glad to be able
    to set aside my study of hashes for another day. I needed to get this
    problem solved so that I can get some other work done.

    To correct the script, I added LABELS, used the next statement with a
    LABEL, and moved the print statement out of the inner loop, and voilà,
    it worked properly. I quickly went over the logic of the working program
    and it makes sense. It's funny how things seem so clear once they're
    solved!

    My files are actually probably only 30 to 35 lines each, so size isn't a
    problem. The real data I'm comparing consists of email addresses. File2
    will either match File1 or have new email addresses in it. I then do
    stuff with the new email addresses.

    Thanks again. I really appreciate the help.

    Here's the working program:

    use strict;
    use warnings;

    my $file1;
    my $file2;

    my @file1 = qw(oranges apples bananas);
    my @file2 = qw(apples kiwi bananas);

    FILE2: foreach $file2 (@file2){
        FILE1: foreach $file1 (@file1){
            if ("$file2" eq "$file1") {
                next FILE2;
            }
        }
        print "$file2 \n";
    }

    The output is "kiwi", which is exactly right.
    kiwi





    Stuart Clemons Guest

  7. #7

    RE: Need help comparing lines in two files


    Hi Stuart,

    Have a look on CPAN (www.cpan.org); there are two wonderful packages that
    do exactly what you are dreaming of:

    Algorithm::Diff
    Text::ParagraphDiff


    Have a nice day
    Michel



    --
    To unsubscribe, e-mail: beginners-unsubscribeperl.org
    For additional commands, e-mail: beginners-helpperl.org
    <http://learn.perl.org/> <http://learn.perl.org/first-response>

    Eurospace Szarindar Guest

  8. #8

    RE: Need help comparing lines in two files

    Thanks Michel. I'll take a look at those modules and see if my Perl
    skills are sufficient to understand how to use them. Thanks again.





    Stuart Clemons Guest
