Escape regex harmful characters - PERL Beginners

  1. #1

    Escape regex harmful characters

    Hello everyone,

    I found the subroutine below from a script that takes a string of text
    and builds a regular expression out of it and incorporates that into a
    rule for SpamAssassin. It's worked quite well for my needs, but the
    script has led to a few questions about the code and regex's.

    My questions are:

    1) Are the characters escaped in the subroutine *all* the characters
    that *should* be escaped in a regex?

    2) The author seems to think "this is crap code, there will be a better
    way to do this".....thoughts, comments, suggestion on other ways this
    can be done?


    #escape stuff that would otherwise be regexp special
    #this is crap code, there will be a better way to do this
    sub escapebad {
    my $string = $_[0];
    $string =~ s/([\\\/\^\.\$\*\+\?\\{\}\[\]\(\)\<\>])/\\$&/g;
    #bad chars turned good
    return ($string);
    }

    Also, if anyone is interested in the script for generating SpamAssassin
    rules, let me know and I'll send you a copy.

    Thanks for any help,
    Kevin
    --
    Kevin Old <koldkold.homelinux.com>

    Kevin Old Guest
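
On question 1: one character the posted character class does not cover is the pipe (|), which means alternation inside a pattern. The following is a minimal sketch of the consequence, comparing the posted subroutine with Perl's built-in quotemeta. The sample phrase and the variable names are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# The subroutine from the post, reproduced as given.
sub escapebad {
    my $string = $_[0];
    $string =~ s/([\\\/\^\.\$\*\+\?\\{\}\[\]\(\)\<\>])/\\$&/g;
    return ($string);
}

# Hypothetical sample phrase containing a pipe.
my $phrase = 'free|money';

my $by_hand    = escapebad($phrase);    # pipe is not in the class, so it is left alone
my $by_builtin = quotemeta($phrase);    # pipe gets a backslash

print "escapebad: $by_hand\n";
print "quotemeta: $by_builtin\n";

# With the pipe unescaped, the pattern means "free" OR "money",
# so it matches text that never contains the literal phrase.
print "escapebad pattern matches 'money talks'\n"
    if 'money talks' =~ /$by_hand/;
print "quotemeta pattern does not match 'money talks'\n"
    unless 'money talks' =~ /$by_builtin/;
```

Whether other characters (for example # or whitespace) also need escaping depends on how the generated rule is used, such as under the /x modifier, so treat this as one confirmed gap rather than a complete audit.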

  2. #2

    Re: Escape regex harmful characters

    On 12/30/2003 12:22 PM, Kevin Old wrote:
    > Hello everyone,
    >
    > I found the subroutine below from a script that takes a string of text
    > and builds a regular expression out of it and incorporates that into a
    > rule for SpamAssassin. It's worked quite well for my needs, but the
    > script has led to a few questions about the code and regex's.
    >
    > My questions are:
    >
    > 1) Are the characters escaped in the subroutine *all* the characters
    > that *should* be escaped in a regex?
    >
    > 2) The author seems to think "this is crap code, there will be a better
    > way to do this".....thoughts, comments, suggestion on other ways this
    > can be done?
    >
    >
    > #escape stuff that would otherwise be regexp special
    > #this is crap code, there will be a better way to do this
    > sub escapebad {
    > my $string = $_[0];
    > $string =~ s/([\\\/\^\.\$\*\+\?\\{\}\[\]\(\)\<\>])/\\$&/g;
    > #bad chars turned good
    > return ($string);
    > }
    >
    > Also, if anyone is interested in the script for generating SpamAssassin
    > rules, let me know and I'll send you a copy.
    >
    > Thanks for any help,
    > Kevin
    If you have a string that is going to need escaping, consider using
    /\Q$string\U/ to handle quoting regex special chars.

    Regards,
    Randy.


    Randy W. Sims Guest

  3. #3

    Re: Escape regex harmful characters

    On Dec 30, 2003, at 12:30 PM, Randy W. Sims wrote:
    > If you have a string that is going to need escaping, consider
    > using /\Q$string\U/ to handle quoting regex special chars.
    Right -- but that should be \E (for "end") instead of \U (the
    mnemonic for which is "uppercase", not "unquote").

    % perl -le 'print "\Q\Uyyy"'
    YYY

    --
    Steve

    Steve Grazzini Guest
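
To make the corrected form concrete, here is a small sketch of \Q...\E inside a larger pattern. The helper name and sample strings are hypothetical; the point is only that \E ends the quoting, so the rest of the pattern keeps its normal regex meaning.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $text starts with "The ", followed by
# $literal taken character for character, followed by " was".
sub starts_with_literal {
    my ($text, $literal) = @_;
    # \Q begins quoting of regex metacharacters in the interpolated value,
    # \E ends it, so ^ and the surrounding words stay ordinary pattern text.
    return $text =~ /^The \Q$literal\E was/ ? 1 : 0;
}

my $literal = 'price: $5.00 (approx.)';
my $text    = 'The price: $5.00 (approx.) was quoted twice.';

print "matched literally\n" if starts_with_literal($text, $literal);

# The dot is quoted, so it no longer stands for "any character".
print "no wildcard match\n"
    unless starts_with_literal('The aXb was here', 'a.b');
```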

  4. #4

    Re: Escape regex harmful characters

    Steve Grazzini wrote:
    > On Dec 30, 2003, at 12:30 PM, Randy W. Sims wrote:
    > > If you have a string that is going to need escaping, consider
    > > using /\Q$string\U/ to handle quoting regex special chars.
    >
    > Right -- but that should be \E (for "end") instead of \U (the
    > mnemonic for which is "uppercase", not "unquote").
    >
    > % perl -le 'print "\Q\Uyyy"'
    > YYY
    Which all makes a very good argument for the clearly named
    quotemeta:

    Greetings! E:\d_drive\perlStuff>perl -w
    print "What's on your mind?\n";
    my $input_line = <STDIN>;
    chomp $input_line;
    my $escaped_string = quotemeta $input_line;
    print "That's pretty bloody cryptic, y'know. Howzabout a complete
    sentence?\n";
    my $full_line = <STDIN>;
    if ($full_line =~ /$escaped_string/) {
    print"the escaped string worked\n";
    } else {
    print "escaping didn't help, dangitall!\n";
    }

    if ($full_line =~ /$input_line/) {
    print "The input line worked all on its lonely, anyway.\n";
    } else {
    print "Maybe the escaped one did it?\n";
    }

    ^Z
    What's on your mind?
    $5.00 to joeeverywhere.com
    That's pretty bloody cryptic, y'know. Howzabout a complete
    sentence?
    I owe $5.00 to joeeverywhere.com, and I don't have it.
    the escaped string worked
    Maybe the escaped one did it?

    Joseph

    R. Joseph Newton Guest
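
The session above can be condensed into a non-interactive sketch that also shows why the raw string failed: an unescaped $ in the middle of a pattern is an end-of-line anchor, so the text after it can never match. The subroutine names and the shortened sample strings are assumptions made for this illustration.

```perl
use strict;
use warnings;

# True when $needle occurs in $haystack as literal text.
sub contains_literal {
    my ($haystack, $needle) = @_;
    my $escaped = quotemeta $needle;
    return $haystack =~ /$escaped/ ? 1 : 0;
}

# True when $needle, used unescaped as a pattern, matches $haystack.
# Note: an invalid pattern (for example a lone "(") would make this die.
sub matches_raw {
    my ($haystack, $needle) = @_;
    return $haystack =~ /$needle/ ? 1 : 0;
}

my $needle   = '$5.00 to joe';
my $haystack = 'I owe $5.00 to joe, and I don\'t have it.';

print "escaped: ", contains_literal($haystack, $needle), "\n";
print "raw:     ", matches_raw($haystack, $needle), "\n";
```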
