matching hyphenated strings - PERL Beginners

  1. #1

    matching hyphenated strings

    Here's some test code I'm working with:

    ## begin ##

    while ($name = <DATA>) {
    $name =~ /(\w*)\.*/;
    $name{$1}++;
    $name =~ /(\w+)/;
    print "$& \n";
    }


    __DATA__
    tibor.test.net
    mars.test.net
    moon-bx-r.test.net
    moon-bs-d.test.net
    moon-bt-321.test.net

    ## end ##

    This works for hostnames without hyphens, but when there is a hyphen in the
    name, everything after the hyphen is ignored. I've been trying things like
    $name =~ /[a-z]*\-*\-*/ with no luck. The data coming into the expression
    may or may not be fully qualified, so I can't just take everything to the left
    of .test.net, and the domain name may be different at times, anyway.

    So what I'm left with is finding an expression that will match any
    alphanumeric, with 0 or more embedded dashes. It sounds simple, but I
    can't seem to find it.

    What am I missing?


    Thanks,

    d



    Deb Guest

  2. #2

    Re: matching hyphenated strings

    And the clouds parted, and deb said...
    > What am I missing?
    Two things:

    1) The regex you're looking for is likely /[-\w]+/, which says "match one
    or more dashes or word characters". This will slurp up everything up to
    the first non-word, non-dash character.

    2) You can probably simplify your script to

    ## begin ##

    while (<DATA>) {
    (print "$& \n" and $name{$&}++) if /[-\w]+/;
    }

    __DATA__
    tibor.test.net
    mars.test.net
    moon-bx-r.test.net
    moon-bs-d.test.net
    moon-bt-321.test.net

    ## end ##

    Its output is:
    ksh$ ./mchname
    tibor
    mars
    moon-bx-r
    moon-bs-d
    moon-bt-321
    ksh$
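
    To see why the hyphen has to be inside the character class, here is a
    minimal sketch that runs both patterns over the same names. The helper
    name leading_label and the sample list are made up for illustration;
    they are not part of the original script.

    ```perl
    use strict;
    use warnings;

    # Return the first run of characters in $line that matches $pattern,
    # or undef when nothing matches.
    sub leading_label {
        my ($line, $pattern) = @_;
        return $line =~ /($pattern)/ ? $1 : undef;
    }

    my @hosts = qw(tibor.test.net moon-bx-r.test.net moon-bt-321);

    for my $host (@hosts) {
        my $plain  = leading_label($host, qr/\w+/);     # stops at the first hyphen
        my $dashed = leading_label($host, qr/[-\w]+/);  # hyphen is part of the class
        printf "%-20s \\w+ => %-8s [-\\w]+ => %s\n", $host, $plain, $dashed;
    }
    ```

    For names without a hyphen the two patterns agree; they only differ
    once a hyphen shows up before the first dot.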

    HTH-
    Brian



    Brian Gerard Guest

  3. #3

    Re: matching hyphenated strings

    And the clouds parted, and Rob Dixon said...
    >
    > That's a misuse of 'and' Brian. It says that the hash element
    > should be incremented only if the print succeeds, which isn't
    > what was intended. If you mean a code block then use a code
    > block.
    Mea Culpa. You're absolutely correct. Thanks for the correction. :)


    Brian Gerard Guest

  4. #4

    Re: matching hyphenated strings

    Deb wrote:
    > What am I missing?
    Hi Deb.

    You're missing the hyphen from the character class. For ASCII text
    the \w class is the same as [0-9A-Za-z_], and what you need is all
    of those characters plus 'hyphen'.
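
    Note that adding the hyphen to \w also lets through underscores and
    names that start or end with a hyphen. If "alphanumeric with 0 or more
    embedded dashes" is meant strictly, a tighter pattern is possible.
    This is only a sketch, assuming ASCII letters and digits; the names
    $label and host_label are hypothetical.

    ```perl
    use strict;
    use warnings;

    # Alphanumeric runs joined by single hyphens: no leading, trailing,
    # or doubled hyphen, and no underscore.
    my $label = qr/[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*/;

    # Return the first dot-separated label of $name, or undef if the
    # name does not start with a label of that shape.
    sub host_label {
        my ($name) = @_;
        return $name =~ /^($label)(?:\.|\z)/ ? $1 : undef;
    }

    for my $name (qw(moon-bx-r.test.net mars -bad.test.net)) {
        my $found = host_label($name);
        print "$name => ", defined $found ? $found : '(no match)', "\n";
    }
    ```

    The \z anchor means the input should be chomped first; a line read
    straight from a filehandle still carries its newline.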

    This seemed a good time to showcase the much-misunderstood and
    underused qr// construct. If we do this:

    my $w = qr/[\w-]/;

    then there is a new character class all on its own which you can use
    instead of \w in your regexes. Check out the program below.

    But I'm left wondering what you're trying to do with the lines

    $name =~ /(\w*)\.*/;
    $name =~ /(\w+)/;

    which I can't fathom.

    HTH,

    Rob


    use strict;
    use warnings;

    my $w = qr/[\w-]/; # Word characters plus hyphen

    my %name;

    while (my $name = <DATA>) {
        $name =~ /($w*)/;
        $name{$1}++;
        print "$1\n";
    }

    __DATA__
    tibor.test.net
    mars.test.net
    moon-bx-r.test.net
    moon-bs-d.test.net
    moon-bt-321.test.net


    **OUTPUT

    tibor
    mars
    moon-bx-r
    moon-bs-d
    moon-bt-321
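
    Because a qr// object interpolates like any other variable, the same
    $w can be reused inside a larger pattern. As a rough sketch, here it
    splits a name into host and optional domain, which covers input that
    may or may not be fully qualified. The helper split_host is
    hypothetical and not part of the program above.

    ```perl
    use strict;
    use warnings;

    my $w = qr/[\w-]/;    # Word characters plus hyphen, as above

    # Split a name into its first label and whatever domain follows.
    # The domain is undef for an unqualified name; an empty list is
    # returned when the name does not match at all.
    sub split_host {
        my ($name) = @_;
        my ($host, $domain) = $name =~ /^($w+)(?:\.($w+(?:\.$w+)*))?$/
            or return;
        return ($host, $domain);
    }

    for my $name (qw(moon-bt-321.test.net tibor)) {
        my ($host, $domain) = split_host($name);
        print "host=$host domain=", defined $domain ? $domain : '(none)', "\n";
    }
    ```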



    Rob Dixon Guest

  5. #5

    Re: matching hyphenated strings

    At 10:17:47, on 11.25.03:
    Cracks in my tinfoil beanie
    allowed Brian Gerard to seep these bits into my brain:
    > And the clouds parted, and deb said...
    > > What am I missing?
    >
    > Two things:
    >
    > 1) The regex you're looking for is likely /[-\w]+/, which says "match one
    > or more dashes or word characters". This will slurp up everything up to
    > the first non-word, non-dash character.
    Gagh, it's so simple. I had tried /[\w-]/ without the
    trailing "+" and of course it failed. So close...
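
    For anyone else who trips over this: a character class on its own
    matches exactly one character, so the quantifier makes all the
    difference. A quick sketch with one of the sample names:

    ```perl
    use strict;
    use warnings;

    my $name = 'moon-bx-r.test.net';

    # A bare character class consumes exactly one character.
    my ($one)  = $name =~ /([\w-])/;
    # Adding "+" lets the class repeat until a character outside it appears.
    my ($many) = $name =~ /([\w-]+)/;

    print "without +: $one\n";   # prints "without +: m"
    print "with +:    $many\n";  # prints "with +:    moon-bx-r"
    ```
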
    > 2) You can probably simplify your script to
    >
    > ## begin ##
    >
    > while (<DATA>) {
    > (print "$& \n" and $name{$&}++) if /[-\w]+/;
    > }
    Thanks, this helps a lot. :-)

    deb
    Deb Guest

  6. #6

    Re: matching hyphenated strings

    Brian Gerard wrote:
    >
    > 2) You can probably simplify your script to
    >
    > ## begin ##
    >
    > while (<DATA>) {
    > (print "$& \n" and $name{$&}++) if /[-\w]+/;
    > }
    That's a misuse of 'and' Brian. It says that the hash element
    should be incremented only if the print succeeds, which isn't
    what was intended. If you mean a code block then use a code
    block.
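
    Something along these lines is what a block version might look like.
    It is only a sketch: the sub name count_label and the sample names are
    invented here, and it captures into $1 rather than using $&.

    ```perl
    use strict;
    use warnings;

    my %name;

    # Count the leading label of each line. The hash update sits in its own
    # statement, so it does not depend on print's return value.
    sub count_label {
        my ($line) = @_;
        if ($line =~ /([-\w]+)/) {
            print "$1\n";
            $name{$1}++;
            return $1;
        }
        return;
    }

    count_label($_) for qw(mars.test.net moon-bs-d.test.net mars.example.org);
    print "$_ seen $name{$_} time(s)\n" for sort keys %name;
    ```

    With the 'and' form, a failed print (a closed STDOUT, say) would
    silently skip the increment; here the two steps are independent.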

    Cheers,

    Rob


    Rob Dixon Guest

  7. #7

    Re: matching hyphenated strings

    Hi Rob,

    At 18:44:10, on 11.25.03:
    Cracks in my tinfoil beanie
    allowed Rob Dixon to seep these bits into my brain:
    > Deb wrote:
    > >
    > > What am I missing?
    >
    > Hi Deb.
    >
    > You're missing the hyphen from the character class. The \w class
    > is the same as [0-9A-Za-z_], and what you need is all of those
    > characters plus 'hyphen'.
    >
    > This seemed a good time to showcase the much-misunderstood and
    > underused qr// construct. If we do this:
    >
    > my $w = qr/[\w-]/;
    Learned something new - I was not aware of the qr// construct...
    >
    > But I'm left wondering what you're trying to do with the lines.
    >
    > $name =~ /(\w*)\.*/;
    > $name =~ /(\w+)/;
    >
    > which I can't fathom.
    ;-) No doubt. My apologies for being sloppy. It was left over
    from a previous test and shouldn't have been included in the
    codelet of my first message.
    > HTH,
    Yes!

    Thanks for sharing a new way to do this. Very nice.

    deb
    Deb Guest
