Regular expression bug?


  1. #1

    Regular expression bug?

    All of CF's RE functions act in a weird way, contrary to the documentation
    (both CF's own and the underlying Java regex docs). The special characters ^
    and $ are defined as: 'Boundary matchers ^ The beginning of a line $ The end
    of a line'. But CF will only match the beginning or end of a STRING, not of a
    line within that string. Couple this with the fact that the 'Whitespace
    Management' option in the administrator has no effect on the output, and the
    only options left are obscure, system-dependent hacks (matching against
    '['&chr(10)&chr(13)&']{2}') to remove the literally thousands of empty (or
    whitespace-filled) lines in a typical CF-generated page. Without the ability
    to find the true start or end of a line, it's impossible to remove them.

    twillerror Guest

  3. #2

    Re: Regular expression bug?

    http://www.regular-expressions.info/anchors.html
    http://www.regular-expressions.info/java.html

    From the first link:
    "Using ^ and $ as Start of Line and End of Line Anchors

    If you have a string consisting of multiple lines, like first line\nsecond
    line (where \n indicates a line break), it is often desirable to work with
    lines, rather than the entire string. Therefore, all the regex engines
    discussed in this tutorial have the option to expand the meaning of both
    anchors. ^ can then match at the start of the string (before the f in the above
    string), as well as after each line break (between \n and s). Likewise, $ will
    still match at the end of the string (after the last e), and also before every
    line break (between e and \n).

    In text editors like EditPad Pro or GNU Emacs, and regex tools like PowerGREP,
    the caret and dollar always match at the start and end of each line. This makes
    sense because those applications are designed to work with entire files, rather
    than short strings.

    In all programming languages and libraries discussed on this website, except
    Ruby, you have to explicitly activate this extended functionality. It is
    traditionally called "multi-line mode". In Perl, you do this by adding an m
    after the regex code, like this: m/^regex$/m;. In .NET, the anchors match
    before and after newlines when you specify RegexOptions.Multiline, such as in
    Regex.Match("string", "regex", RegexOptions.Multiline)."

    Kronin555 Guest
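
The quoted passage can be checked with a small Java sketch. Java is used here only because the original post points at the Java regex docs; the thread does not confirm which engine CF actually delegates to, so treat this as an illustration of "multi-line mode" in general. The class and method names (MultilineAnchors, countMatches) are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineAnchors {
    // Counts how many times the regex matches in the input under the given flags.
    static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "first line\nsecond line";

        // Default mode: ^ matches only at the start of the whole string.
        System.out.println(countMatches("^\\w+", 0, text)); // 1

        // Multi-line mode: ^ also matches right after each line break.
        System.out.println(countMatches("^\\w+", Pattern.MULTILINE, text)); // 2
    }
}
```

Without the flag only "first" is found; with Pattern.MULTILINE both "first" and "second" are found, which is the behaviour the quote describes as having to be activated explicitly.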

  4. #3

    Re: Regular expression bug?

    > All of CF's RE functions act in a weird way, contrary to the documentation
    > (both CF's own, and the underlying Java Regex docs). The special characters ^
    > and $ are defined as: 'Boundary matchers ^ The beginning of a line $ The end
    > of a line'
    Which CF docs are you reading?
    I'm reading these ones:
    http://livedocs.macromedia.com/coldfusion/7/htmldocs/wwhelp/wwhimpl/common/html/wwhelp.htm?context=ColdFusion_Documentation&file=part_cfm.htm

    Which say:
    "If the caret is at the beginning of a regular expression, the matched
    string must be at the beginning of the STRING being searched."

    (My emphasis).

    It says STRING, not LINE.

    And it says the same sort of thing for both ^ and $, in both the CFMX 6.1
    and CFMX 7 docs.

    > But, CF will only match the beginning or end of a STRING, not a
    > line within that string.
    You should read further down that page to where it discusses (?m), which is
    what you want for end-of-line, rather than end-of-string.

    --

    Adam
    Adam Cameron Guest
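
As a rough sketch of how the inline (?m) flag addresses the original complaint about blank lines, here is a Java version. It assumes Java's java.util.regex semantics; the thread only says the CF docs describe (?m), so the exact CF behaviour should be checked against those docs. BlankLineStripper and stripBlankLines are hypothetical names chosen for this example.

```java
class BlankLineStripper {
    // (?m) switches on multi-line mode inside the pattern itself, so ^ matches
    // at the start of every line, not just at the start of the string.
    // The pattern consumes a line containing only spaces or tabs, together
    // with its line break (\n or \r\n).
    static String stripBlankLines(String text) {
        return text.replaceAll("(?m)^[ \\t]*\\r?\\n", "");
    }

    public static void main(String[] args) {
        String page = "<html>\r\n\r\n   \r\n<body>\n\t\n</body>\n";
        // Prints the three non-blank lines with their original line breaks.
        System.out.print(stripBlankLines(page));
    }
}
```

Because the line break is matched as \r?\n, the pattern does not depend on whether the page uses Windows or Unix line endings, which avoids the system-dependent character-class hack from the first post. A blank line at the very end of the string with no trailing line break is left alone by this particular pattern.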
