HOWTO: File Renaming and Directory Recursion - PERL Beginners


  1. #1

    HOWTO: File Renaming and Directory Recursion


    Earlier this morning, a friend of mine asked me for a script that would
    "given a list of files, replace underscores with spaces", to take this:

    Artist_Name-Track_Name.mp3

    and rename it to this:

    Artist Name-Track Name.mp3

    The script was mindlessly simple, and I felt it would be a good HOWTO
    for the perl beginners crowd, if not to show some good code practices,
    but also to counteract the . .. .. . "controversial" HOWTO that had
    been posted a week or so ago. Certainly, if you find this as misguided
    as his, complain onlist with better examples, or offlist with anger.

    The first bit of code I wrote for him is below. Save for
    the additional explanatory comments, it's nearly exact.

    #!/usr/bin/perl

    # start all your scripts with these two lines.
    # they are the best teacher you will ever find for
    # writing perl code. they make you smarter, and
    # you'll last longer in bed; no drugs necessary.
    #
    use warnings;
    use strict;

    # he needed the script to read every file in a directory
    # and rename them based on whether they had underscores
    # in the name. to make the script as 'immediately runnable'
    # as possible, I assumed he would be placing the script
    # in the directory full of files, and running it there.
    # as such, we'll be opening the current working directory.
    # if anything goes wrong, we stop processing with the error.
    # ALWAYS CHECK FOR SUCCESS BEFORE CONTINUING.
    #
    opendir(DIR, ".") or die $!;

    # next, we load all the files in that directory into
    # an array. at this point, we could also have used the
    # grep() function to filter out entries that weren't
    # relevant, but for readability I chose a more configurable
    # approach (see below).
    #
    my @files = readdir(DIR);
    closedir(DIR); # would happen implicitly at exit anyway.

    # now, we need to loop through all the directory entries,
    # stored in the @files array. we're going to use $_ here,
    # which can be remembered as "the thing we want to work with".
    # we could just as easily have used an explicit variable name,
    # and in larger scripts, you usually want to. note that this
    # has to be "for", which puts each entry in $_ in turn; a
    # "while (@files)" would never set $_ and would never finish.
    #
    for (@files) {

        # so, the filename is now stored in $_, and since $_ is
        # assumed for a number of Perl's functions, we don't
        # have to explicitly mention it in the following filters.
        # these filters serve one purpose: make sure we're working
        # ONLY with files we should be. these are sanity checks:
        # we're ensuring that we're not operating on anything we
        # shouldn't. this is good practice: ALWAYS CHECK YOUR SANITY.
        # rule out everything you don't want, and focus on everything
        # you do. first, we skip directories with the -d check.
        # the syntax I'm using below is far more readable than a bunch
        # of if/else statements: we're not increasing our indents, and
        # we don't have to worry about a zillion open/closing brackets.
        # it also reads more like English.
        #
        next if -d;

        # if we're still here, we've got a file. we'll automatically
        # skip files that begin with a "." as they're usually considered
        # "special", and renaming them can be a bad thing.
        #
        next if /^\./;

        # and finally, if the file doesn't have any underscores in
        # it, we can skip it immediately. again, this is for safety:
        # the rest of our code could operate on the file and rename
        # it with the same filename, but why waste that processing
        # power? it's just dirty. it's how bad things happen.
        #
        next unless /_/;

        # at this point, we're assuming that this is a file
        # we're supposed to be working on. so, we copy the file
        # name, do our "underscore for space" conversion, and
        # issue a rename() from the original name to the new.
        # again, if something goes wrong with the rename, we
        # die immediately. this is probably overly cautious.
        #
        my $new_name = $_;
        $new_name =~ s/_/ /g;
        rename($_, $new_name) or die $!;
    }

    And that's the script. For readability, no comments:

    #!/usr/bin/perl
    use warnings;
    use strict;

    opendir(DIR, ".") or die $!;
    my @files = readdir(DIR);
    closedir(DIR);

    for (@files) {
        next if -d;
        next if /^\./;
        next unless /_/;

        my $new_name = $_;
        $new_name =~ s/_/ /g;
        rename($_, $new_name) or die $!;
    }
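
    The comments in the first listing mention two variations: filtering
    with grep() and using an explicit variable name instead of $_. A
    minimal sketch of both, assuming nothing beyond core Perl. The names
    spaced_name() and rename_in_dir() are made up for this example and
    are not part of the original script:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: work out the new name for one file.
    # Returns undef (in scalar context) when the file should be left alone.
    sub spaced_name {
        my ($name) = @_;
        return if $name =~ /^\./;      # skip "special" dotfiles
        return unless $name =~ /_/;    # nothing to replace
        (my $new_name = $name) =~ s/_/ /g;
        return $new_name;
    }

    # Hypothetical helper: rename the plain files directly inside $dir
    # and return how many were renamed.
    sub rename_in_dir {
        my ($dir) = @_;

        opendir(my $dh, $dir) or die "can't open $dir: $!";
        # grep() drops directories up front; "$dir/$_" is needed because
        # readdir() returns bare names, not paths.
        my @files = grep { !-d "$dir/$_" } readdir($dh);
        closedir($dh);

        my $renamed = 0;
        for my $file (@files) {
            my $new_name = spaced_name($file);
            next unless defined $new_name;
            rename("$dir/$file", "$dir/$new_name")
                or die "can't rename $file: $!";
            $renamed++;
        }
        return $renamed;
    }
    ```

    Calling rename_in_dir(".") would then do the same job as the script
    above for the current directory.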

    It worked fine for him, and we moved on. A few hours later, he
    asked for a recursive version, and whether that would be "hard
    to do". While I wasn't around to help him out, the weak solution
    was easy: just move the script into each new directory and run
    it again. But, there are two other solutions to this new request:
    the bad one, and the good one.

    The bad one is to assume the first script is perfect: it's not.
    It works if you're in the current directory, and the assumptions
    are that no recursion is necessary. A bad approach to the recursive
    problem is to start modifying the above script to manually support
    it: people think "hey, I got this working, recursion must be
    simple as pie, right?!". Usually, they'll end up with something
    like (pseudo non-working code follows):

    my DIRECTORIES = "start_directory"

    foreach DIRECTORY (DIRECTORIES) {
        get list of ENTRIES in DIRECTORY

        foreach ENTRY (ENTRIES) {
            if ENTRY is a DIRECTORY, add to DIRECTORIES
        }

        finished DIRECTORY; remove it from DIRECTORIES
    }

    And you know what? This approach *can* work, but you're reinventing
    the wheel: this is such a common problem ("how do I recurse through
    directories") that it has been mentioned in a zillion FAQs. But no
    one reads FAQs, and no one reads HOWTO, so we're gonna be ing
    gas for the rest of our lives.
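
    For comparison, here is a rough working version of that pseudocode.
    It assumes plain "/"-separated paths and skips symlinks so a link
    back up the tree can't cause an endless loop. list_files_by_hand()
    is a made-up name, and this is only a sketch of the hand-rolled
    approach, not a recommendation:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: collect every plain file under $start by hand,
    # keeping a work list of directories still to visit.
    sub list_files_by_hand {
        my ($start) = @_;
        my @directories = ($start);
        my @found;

        # "while" is right here because shift() shrinks the work list
        # on every pass, so the loop ends once it is empty.
        while (defined(my $dir = shift @directories)) {
            opendir(my $dh, $dir) or die "can't open $dir: $!";
            my @entries = grep { $_ ne "." && $_ ne ".." } readdir($dh);
            closedir($dh);

            for my $entry (@entries) {
                my $path = "$dir/$entry";
                next if -l $path;              # don't follow symlinks
                if (-d $path) {
                    push @directories, $path;  # visit it later
                }
                elsif (-f $path) {
                    push @found, $path;
                }
            }
        }
        return @found;
    }
    ```

    Even this short version has to deal with "." and "..", symlinks,
    and path separators by itself.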

    The proper solution to recursing directories is File::Find. It's
    included with every distribution of Perl, is quick and easy to use,
    and allows code that looks nearly exactly like our first example.
    It's also far more platform-agnostic than you'd ever expect your
    code to need to be. The revised script:

    #!/usr/bin/perl
    use warnings;
    use strict;

    use File::Find;

    # we no longer have to read directories
    # ourselves: File::Find takes care of that
    # for us - we just define a subroutine for
    # what we want to do with what's been found.
    #
    find(\&underscores, ".");

    # and here is that subroutine. it's nearly exactly
    # the same as our previous code. by default, File::Find
    # has already moved us into the directory that contains
    # the file ($File::Find::dir) and put the bare filename
    # in $_, so no chdir() of our own is needed. this is still
    # a quick hack because I knew this wouldn't be production-code:
    # a more proper solution would be to stay where we are in the
    # directory structure, and give full paths to our rename(). this
    # would require the help of another module, File::Spec.
    # find out more with "perldoc File::Spec". it's handy.
    # note that we leave with "return", not "next": this is
    # a subroutine now, not the body of a loop.
    #
    sub underscores {
        return if -d $_;
        return if /^\./;
        return unless /_/;

        my $new_name = $_;
        $new_name =~ s/_/ /g;
        rename($_, $new_name) or die $!;
    }
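
    The "more proper" variant mentioned in the comments, staying put and
    handing full paths to rename(), might look roughly like this. It is
    a sketch, not the author's code: it assumes File::Find's no_chdir
    option together with the core File::Basename and File::Spec modules.
    rename_tree() is a hypothetical name, and taking the start directory
    as a parameter is also an assumption:

    ```perl
    use strict;
    use warnings;

    use File::Find;
    use File::Spec;
    use File::Basename qw(fileparse);

    # Hypothetical helper: rename underscored files anywhere under
    # $start and return the list of new paths.
    sub rename_tree {
        my ($start) = @_;
        my @renamed;

        find({
            no_chdir => 1,    # stay in the directory we were started from
            wanted   => sub {
                my $path = $File::Find::name;    # path including $start
                return if -d $path;

                # split into bare filename and containing directory
                my ($name, $dir) = fileparse($path);
                return if $name =~ /^\./;
                return unless $name =~ /_/;

                # only the filename changes; the directory part is kept
                (my $new_name = $name) =~ s/_/ /g;
                my $new_path = File::Spec->catfile($dir, $new_name);

                rename($path, $new_path) or die "can't rename $path: $!";
                push @renamed, $new_path;
            },
        }, $start);

        return @renamed;
    }
    ```

    Something like rename_tree(".") would cover the same ground as the
    script above.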

    One of the best traits you can learn as a Perl programmer is
    mastering the use of the core modules, as well as how to find what
    you need on CPAN: a good metric ton of your code will look far cleaner,
    far easier to understand, and far more maintainable (and FAR more
    documented too!). Likewise, you'll get far more done, and with
    fewer "doh!" bugs. Try to understand that a good number of the problems
    you'll face in programming have been solved for you: it's just a
    matter of taking the time to find the answer instead of coding your
    own "solution" that really isn't.

    Yep.

    --
    Morbus Iff ( shower your women, i'm coming )
    Technical: http://www.oreillynet.com/pub/au/779
    Culture: http://www.disobey.com/ and http://www.gamegrene.com/
    icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
    Morbus Guest

  2. #2

    Re: HOWTO: File Renaming and Directory Recursion

    On Thu, Apr 01, 2004 at 07:05:51PM -0500, Morbus Iff wrote:
     

    [ snip ]
     

    Are you sure that's not:

    for (@files) {

    ?
     

    Here's a third:

    $ rename 'y/_/ /' **/*(.)

    That's zsh globbing and rename that used to come with perl. rename is
    now part of debian, and google tells me it can be found at:

    http://www.hurontel.on.ca/~barryp/menu-mysql/music_rename-1.12c/rename

    Though I'll admit that that solution doesn't provide so many
    opportunities for learning Perl.
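
    For anyone wondering about the y/_/ / in that one-liner: y/// is
    Perl's transliteration operator, another spelling of tr///. A small
    sketch of what that expression does to a single filename follows.
    underscores_to_spaces() is a made-up name, and the rename utility
    itself is not reproduced here:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: y/// swaps characters one for one across
    # the whole string, so no /g flag is needed.
    sub underscores_to_spaces {
        my ($name) = @_;
        (my $new_name = $name) =~ y/_/ /;
        return $new_name;
    }

    print underscores_to_spaces("Artist_Name-Track_Name.mp3"), "\n";
    # prints "Artist Name-Track Name.mp3"
    ```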

    --
    Paul Johnson - net
    http://www.pjcj.net
    Paul Guest

  3. #3

    Re: HOWTO: File Renaming and Directory Recursion

    On 4/1/2004 7:05 PM, Morbus Iff wrote:
     

    Nice work. Just two quick comments. 1) Above, the chdir is not necessary
    because File::Find moves through the directory structure unless you
    specify no_chdir(?). 2) It might be instructive to follow up with one
    that handles the directory as a parameter; I agree it would have been
    distracting here, but as a follow-up it would be a good how-to for a
    common task (incl usage && pod also).

    Again, this is a great example of a good How-To.

    Regards,
    Randy.


    Randy Guest

  4. #4

    Re: HOWTO: File Renaming and Directory Recursion

    >> while (@files) {

    Yup, "for" is right. An error in my memory recall.

    --
    Morbus Iff ( evil is my sour flavor )
    Technical: http://www.oreillynet.com/pub/au/779
    Culture: http://www.disobey.com/ and http://www.gamegrene.com/
    icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
    Morbus Guest

  5. #5

    Randal's columns (was Re: HOWTO: File Renaming and Directory Recursion)

    >>>>> "Morbus" == Morbus Iff <com> writes:

    Morbus> The script was mindlessly simple, and I felt it would be a good HOWTO
    Morbus> for the perl beginners crowd, if not to show some good code practices,
    Morbus> but also to counteract the . .. .. . "controversial" HOWTO that had
    Morbus> been posted a week or so ago. Certainly, if you find this as misguided
    Morbus> as his, complain onlist with better examples, or offlist with anger.

    Not necessarily a better example, but it's always worth a quick
    google of "site:stonehenge.com $YOUR_PERL_KEYWORDS_HERE". In this
    case, using "File::Find" and "rename", the first hit is:

    <http://www.stonehenge.com/merlyn/LinuxMag/col45.html>

    which I wrote fairly recently and covers nearly identical ground.

    Seriously, after 192 magazine articles, there's not *much* that isn't
    illustrated in that archive of

    http://www.stonehenge.com/merlyn/UnixReview/
    http://www.stonehenge.com/merlyn/WebTechniques/
    http://www.stonehenge.com/merlyn/LinuxMag/
    http://www.stonehenge.com/merlyn/PerlJournal/

    --
    Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
    <com> <URL:http://www.stonehenge.com/merlyn/>
    Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
    See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
    Randal Guest
