Regular Expression Problem

Ask a Question related to PERL Miscellaneous, Design and Development.

  1. #1

    Regular Expression Problem

    I am trying to parse a relatively simple SQL query with a regular
    expression. All is going well except for one issue I can't seem to find a
    solution for: handling optional parentheses and brackets. This seems like
    a backreference problem to me, but I am not sure. Let me give a simple
    example.

    A table name might be enclosed in brackets or it might not. For the sake of
    simplification, let's assume that the pattern [A-Za-z_]+ is what we are
    looking for in a table name. A really simple solution to find table names,
    bracket-enclosed or not, would be as follows:

    /(\[[A-Za-z_]+\])|([A-Za-z_]+)/

    The problem is that the pattern for a table name is not in reality as simple
    as [A-Za-z_]+. In fact it is quite long, and repeating it twice in the
    expression seems inefficient, not to mention prone to bugs if I need to tweak
    it and don't get each side exactly the same. What I want to do is something
    like the following:

    /(\[?)[A-Za-z_]+\1/

    This almost works, except of course that my backreference is looking for an
    opening bracket [ when I need it to be looking for a closing bracket ]. So
    here is the crux of my question: is there any way to do something like
    this, to have a backreference that does some sort of fuzzy match? I have a
    similar issue with parentheses.

    Thanks for the help,
    Ken Baltrinic


    Kenneth Baltrinic Guest

  3. #2

    Re: Regular Expression Problem

    On Wed, 02 Jul 2003 04:33:24 GMT,
    Kenneth Baltrinic <kenneth@baltrinic.com> wrote:
    > I am trying to parse a relatively simple SQL query with a regular
    > expression. All is going well except for one issue I can't seem to find a
    > solution for: handling optional parentheses and brackets. This seems like
    > a backreference problem to me, but I am not sure. Let me give a simple
    > example.
    >
    > A table name might be enclosed in brackets or it might not. For the sake of
    > simplification, let's assume that the pattern [A-Za-z_]+ is what we are
    > looking for in a table name. A really simple solution to find table names,
    > bracket-enclosed or not, would be as follows:
    >
    > /(\[[A-Za-z_]+\])|([A-Za-z_]+)/
    >
    > The problem is that the pattern for a table name is not in reality as simple
    > as [A-Za-z_]+. In fact it is quite long, and repeating it twice in the
    > expression seems inefficient, not to mention prone to bugs if I need to tweak
    > it and don't get each side exactly the same. What I want to do is something
    > like the following:
    Why not something like:

    (\[$table_name_pattern])|($table_name_pattern)

    Of course if nested parentheses are needed things get more complicated, and
    moving to a non-regex solution is often easier. Something like
    Parse::RecDescent, for example.
    >
    > /(\[?)[A-Za-z_]+\1/
    >
    > This almost works, except of course that my backreference is looking for an
    > opening bracket [ when I need it to be looking for a closing bracket ]. So
    > here is the crux of my question: is there any way to do something like
    > this, to have a backreference that does some sort of fuzzy match? I have a
    > similar issue with parentheses.
    Can't help with that, sorry...

    --
    Sam Holden

    Sam Holden Guest

  4. #3

    Re: Regular Expression Problem

    > What I want to do is something like the following:
    >
    > /(\[?)[A-Za-z_]+\1/
    >
    > This almost works, except of course that my backreference is looking
    > for an opening bracket [ when I need it to be looking for a closing
    > bracket ]. So here is the crux of my question: is there any way to
    > do something like this, to have a backreference that does some sort
    > of fuzzy match? I have a similar issue with parentheses.
    >
    > Thanks for the help,
    > Ken Baltrinic
    Maybe a (?<= [ )] positive lookbehind is what you are looking for...

    I'm pretty new to this though, and if what you are showing is just a small
    part of the regex, it may not work.

    XC


    Chauncey Williams Guest

  5. #4

    Re: Regular Expression Problem

    Kenneth Baltrinic (kenneth@baltrinic.com) wrote on MMMDXCII September
    MCMXCIII in <URL:news:ogtMa.8784$Hw.6276188@news2.news.adelphia.net>:
    $$ I am trying to parse a relatively simple SQL query with a regular
    $$ expression. All is going well except for one issue I can't seem to find a
    $$ solution for: handling optional parentheses and brackets. This seems like
    $$ a backreference problem to me, but I am not sure. Let me give a simple
    $$ example.
    $$
    $$ A table name might be enclosed in brackets or it might not. For the sake of
    $$ simplification, let's assume that the pattern [A-Za-z_]+ is what we are
    $$ looking for in a table name. A really simple solution to find table names,
    $$ bracket-enclosed or not, would be as follows:
    $$
    $$ /(\[[A-Za-z_]+\])|([A-Za-z_]+)/
    $$
    $$ The problem is that the pattern for a table name is not in reality as simple
    $$ as [A-Za-z_]+. In fact it is quite long, and repeating it twice in the
    $$ expression seems inefficient, not to mention prone to bugs if I need to tweak
    $$ it and don't get each side exactly the same. What I want to do is something
    $$ like the following:
    $$
    $$ /(\[?)[A-Za-z_]+\1/
    $$
    $$ This almost works, except of course that my backreference is looking for an
    $$ opening bracket [ when I need it to be looking for a closing bracket ]. So
    $$ here is the crux of my question: is there any way to do something like
    $$ this, to have a backreference that does some sort of fuzzy match? I have a
    $$ similar issue with parentheses.

    Use the (??{ }) construct:

    /(\[?)[A-Za-z_]+(??{$1 ? "]" : ""})/



    Abigail
    --
    perl -MLWP::UserAgent -MHTML::TreeBuilder -MHTML::FormatText -wle'print +(
    HTML::FormatText -> new -> format (HTML::TreeBuilder -> new -> parse (
    LWP::UserAgent -> new -> request (HTTP::Request -> new ("GET",
    "http://work.ucsd.edu:5141/cgi-bin/http_webster?isindex=perl")) -> content))
    =~ /(.*\))[-\s]+Addition/s) [0]'
    Abigail Guest

  6. #5

    Regular Expression Problem

    Greetings

    I'm hoping that someone could help me with a regular expression string. I
    have a JavaScript function that checks a form text entry to verify that the
    text entered meets pre-defined criteria. The problem is, I can't figure out
    the proper string to use to check the form data entered. Please see the code
    attached.

    I'm trying to create an expression that only allows the following pattern:
    numbers that range from 1 to 100 with an optional asterisk (*) after the
    number, like 78*,
    or the 2 letters 'in' (mixed case allowed: In, iN, IN)

    I just can't seem to get the pattern correct. Any help would be
    appreciated. Thanks, JK

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
    <head>
    <title>Test</title>
    <script language="JavaScript1.2">
    function check(field)
    {
        pattern1 = /^[^0]$|^[1]+[0]+[0]\x2A?$|^[1-9]\x2A?$|^[1-9][0-9]\x2A?$/;
        if(pattern1.test(field.value)==false)
        {
            alert("Please enter a number between 1-100 with an * after or IN. Changing to 1.");
            field.value = 1;
        }
    }
    </script>
    </head>
    <body>
    <form action="test.cfm" method="post">
    <input type="text" name="mark" size="4" maxlength="4"
    onblur="check(this);"><br>
    <input type="text" name="foo" size="4" maxlength="4"><br>
    <input type="submit" name="submit" value="Continue">
    </form>
    </body>
    </html>
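
One quick way to see where pattern1 goes wrong is to run it against a handful of inputs outside the browser. The following is only a sketch: the accepts() helper and the sample list are made up for illustration and are not part of the page above; pattern1 itself is copied unchanged from check().

```javascript
// pattern1 copied unchanged from the check() function in the page above.
const pattern1 = /^[^0]$|^[1]+[0]+[0]\x2A?$|^[1-9]\x2A?$|^[1-9][0-9]\x2A?$/;

// Hypothetical helper: true when pattern1 accepts the given string.
function accepts(value) {
  return pattern1.test(value);
}

// Hypothetical sample inputs chosen to exercise each alternative.
const samples = ["1", "78*", "100", "in", "IN", "0", "x", "*", "1000"];

// Print one line per sample so accepted and rejected inputs are easy to compare.
for (const s of samples) {
  console.log(JSON.stringify(s), accepts(s) ? "accepted" : "rejected");
}
```

With these samples, "in" and "IN" are rejected because no alternative covers two letters, while "x", "*" and "1000" are accepted: the first alternative ^[^0]$ matches any single character other than 0, and [1]+[0]+[0] allows extra digits.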

    Krogman Guest

  7. #6

    Re: Regular Expression Problem

    I put some more thought into this and I have found the solution:
    pattern1 = /^[^\x21\x22\x23\x24\x24\x25\x26\x27\x28\x29\x2A\x2B\x2C\x2D\x2E\x2Fa-hj-mo-zA-HJ-MO-Z]^[^0]*$|^[1]+[0]+[0]\x2A?$|^[1-9]\x2A?$|^[1-9][0-9]\x2A?$|^[i][n]$|^[I][N]$|^[I][n]$|^[i][N]$/;
    I have included a function to change the form value if it is above 100.
    Thanks.

    function check(field,limit)
    {
        pattern1 = /^[^\x21\x22\x23\x24\x24\x25\x26\x27\x28\x29\x2A\x2B\x2C\x2D\x2E\x2Fa-hj-mo-zA-HJ-MO-Z]^[^0]*$|^[1]+[0]+[0]\x2A?$|^[1-9]\x2A?$|^[1-9][0-9]\x2A?$|^[i][n]$|^[I][N]$|^[I][n]$|^[i][N]$/;
        if(pattern1.test(field.value)==false)
        {
            alert("Please enter a number between 1-100 with an * after or IN. Changing to 1.");
            field.value = 1;
        }
        else
        {
            if(field.value > limit)
            {
                alert("Please enter a number between 1-100 or IN. Changing to 1.");
                field.value = 1;
            }
        }
    }
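
For comparison, the stated requirement can probably be expressed with a much shorter pattern. The sketch below assumes the rule is exactly "a whole number from 1 to 100 without leading zeros, optionally followed by *, or the letters in in any case". The isValidMark() name is hypothetical and this is one possible reading, not the only one.

```javascript
// Hypothetical helper, not part of the original script.
// Accepts 1-100 with an optional trailing "*", or the letters "in" in any case.
function isValidMark(value) {
  // (100 | 1-99 without leading zeros) + optional "*", or "in";
  // the i flag only affects the letters, so In, iN and IN all pass.
  const pattern = /^(?:(?:100|[1-9][0-9]?)\*?|in)$/i;
  return pattern.test(value);
}

// Variant of check() built on the helper; the pattern already caps the
// number at 100, so no separate limit comparison is needed here.
function check(field) {
  if (!isValidMark(field.value)) {
    alert("Please enter a number between 1-100 with an * after or IN. Changing to 1.");
    field.value = 1;
  }
}
```

Because the upper bound is part of the pattern, inputs such as 101 or 1000 fail the test directly instead of needing a second numeric comparison.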

    Krogman Guest
