  1. #1

    Default pattern match

    Where can I find info or docs on the "pattern match" used within a WHERE clause
    (MySQL)?
    As I need to match against PHP variables, I'd prefer something adapted to PHP.
    In "PHP and MySQL Web Development" (Luke Welling) I can't see much....


    Michel Guest

  3. #2

    Default Re: pattern match

    Michel wrote:
    > Where can I find info or docs on the "pattern match" used within a WHERE clause
    > (MySQL)?
    > As I need to match against PHP variables, I'd prefer something adapted to PHP.
    > In "PHP and MySQL Web Development" (Luke Welling) I can't see much....
    I take it you want the DB to do the matching, not PHP.

    You want something like:

    select cans_of from canned_goods where lower(cans_of) LIKE lower('%beans%');

    This will get Beans, beans, Pork & Beans, Kidney Beans, etc.

    Here is a good resource for SQL:
    [url]http://www.w3schools.com/sql/default.asp[/url]

    --
    /---+----+----+----+----+----+----++----+----+----+----+----+----+---\
    I [email]pham.nuwen3d6@libertydice.org[/email] II No nation was ever ruined by I
    I [url]http://www.libertydice.org[/url] II trade, even seemingly the most I
    I remove "3d6" to e-mail II disadvantageous. - Ben Franklin I
    \---+----+----+----+----+----+----++----+----+----+----+----+----+---/

    Pham Nuwen Guest

  4. #3

    Default Re: pattern match

    In article <biq71l$pcv$1@news-reader3.wanadoo.fr>,
    "Michel" <MicheldeVathaire@wanadoo.fr> wrote:
    > Where can I find info or docs on the "pattern match" used within a WHERE clause
    > (MySQL)?
    > As I need to match against PHP variables, I'd prefer something adapted to PHP.
    > In "PHP and MySQL Web Development" (Luke Welling) I can't see much....
    This describes MySQL's regular expression implementation:

    <http://www.mysql.com/doc/en/Regexp.html>

    JP

    --
    Sorry, <devnull@cauce.org> is a "spam trap".
    The e-mail address is <jpk"at"akamail.com>, where "at" = @.
    Jan Pieter Kunst Guest

  5. #4

    Default Pattern Match

    Hi All,
    I am very new to Perl, but I sense a great adventure ahead after just
    programming with Cobol, Pascal, and C over the last umpteen years. I have
    written a perl script where I am trying to detect a non-printing
    character(Ctrl@ - Ctrl_) and then substitute a printing ASCII sequence such
    as "^@" in its place, but it does not seem to work as I would like. Any
    advice would be greatly appreciated.

    Thank You....Eric Sand


    $in_ctr=0;
    $out_ctr=0;

    while ($line = <STDIN>)
    {
    chomp($line);
    $in_ctr ++;
    if ($line = s/\c@,\cA,\cB,\cC,\cD,\cE,\cF,\cG,\cH,\cI,\cJ,\cK,
    \cL,\cM,\cN,\cO,\cP,\cQ,\cR,\cS,\cT,\cU,\cV,\cW,
    \cX,\cY,\cZ,\c[,\c\,\c],\c^,\c_
    /^@,^A,^B,^C,^D,^E,^F,^G,^H,^I,^J,^K,
    ^L,^N,^N,^O,^P,^Q,^R,^S,^T,^U,^V,^W,
    ^X,^Y,^Z,^[,^\,^],^^,^_/)
    {
    $out_ctr ++;
    printf("Non-printing chars detected in: %s\n",$line);
    }
    }
    printf("Total records read = %d\n",$in_ctr);
    printf("Total records written with non-printing characters =
    %d\n",$out_ctr);
    Eric Sand Guest

  6. #5

    Default Re: Pattern Match

    Eric Sand wrote:
    >
    > I am very new to Perl, but I sense a great adventure ahead after just
    > programming with Cobol, Pascal, and C over the last umpteen years. I have
    > written a perl script where I am trying to detect a non-printing
    > character(Ctrl@ - Ctrl_) and then substitute a printing ASCII sequence such
    > as "^@" in its place, but it does not seem to work as I would like. Any
    > advice would be greatly appreciated.
    >
    > Thank You....Eric Sand
    >
    >
    Your obvious guess is to write Perl as if it were C. That's slightly better
    than treating it as a scripting language, but there are many joys left to be
    found!
    > $in_ctr=0;
    > $out_ctr=0;
    >
    > while ($line = <STDIN>)
    > {
    > chomp($line);
    > $in_ctr ++;
    > if ($line = s/\c@,\cA,\cB,\cC,\cD,\cE,\cF,\cG,\cH,\cI,\cJ,\cK,
    > \cL,\cM,\cN,\cO,\cP,\cQ,\cR,\cS,\cT,\cU,\cV,\cW,
    > \cX,\cY,\cZ,\c[,\c\,\c],\c^,\c_
    > /^@,^A,^B,^C,^D,^E,^F,^G,^H,^I,^J,^K,
    > ^L,^N,^N,^O,^P,^Q,^R,^S,^T,^U,^V,^W,
    > ^X,^Y,^Z,^[,^\,^],^^,^_/)
    > {
    > $out_ctr ++;
    > printf("Non-printing chars detected in: %s\n",$line);
    > }
    > }
    > printf("Total records read = %d\n",$in_ctr);
    > printf("Total records written with non-printing characters = %d\n",$out_ctr);
    I would write this as below. The first thing is to *always*

    use strict;
    use warnings;


    after which you have to declare all of your variables with 'my'.

    The second is to get used to using the default $_ variable which
    is set to the value for the current 'while(<>)' or 'for' loop
    iteration, and is a default parameter for most built-in functions.

    Finally, in your particular case you're using the s/// (substitute)
    operator wrongly. The first part, s/here//, is a regular expression,
    not a list of characters. You'll need to read up on these at

    perldoc perlre

    The second part, s//here/, is a string expression which can use
    'captured' sequences (anything in parentheses) from the first part
    and, with the addition of the s///e (evaluate) modifier, can
    also be an executable statement. Here I've used it to add 0x40
    to the ASCII value of the control character grabbed by the regex.

    A lot of this won't make sense until you learn some more, but I
    hope you'll agree that this code is cuter than your original?

    HTH,

    Rob



    use strict;
    use warnings;

    my $in_ctr = 0;
    my $out_ctr = 0;

    while (<>) {

    chomp;

    $in_ctr++;

    if (s/([\x00-\1F])/'^'.chr(ord($1) + 0x40)/eg) {
    $out_ctr++;
    printf "Non-printing chars detected in: %s\n", $_;
    }
    }

    printf "Total records read = %d\n", $in_ctr;
    printf "Total records written with non-printing characters = %d\n", $out_ctr;


    Rob Dixon Guest

  7. #6

    Default RE: Pattern Match

    Rob, can you explain the details of that replace? That's pretty slick. I
    see you're adding the hex value to get to the appropriate ASCII value, but
    didn't know you could do some of that gyration inside a regex.

    Thanks.

    -Tom Kinzer


    Tom Kinzer Guest

  8. #7

    Default Re: Pattern Match

    On Dec 9, 2003, at 2:37 PM, Tom Kinzer wrote:
    > Rob, can you explain the details of that replace? That's pretty
    > slick. I
    > see you're adding the hex value to get to the appropriate ASCII value,
    > but
    > didn't know you could do some of that gyration inside a regex.
    The big secret there is the /e modifier at the end of that regex. That
    allows the use of Perl code (to be evaled) as the replacement string.
    You're right though, I thought it was slick too.

    James

    James Edward Gray II Guest

  9. #8

    Default Re: Pattern Match

    Tom Kinzer wrote:
    >
    > Rob, can you explain the details of that replace? That's pretty slick. I
    > see you're adding the hex value to get to the appropriate ASCII value, but
    > didn't know you could do some of that gyration inside a regex.
    I didn't think it was slick at all. In fact I was disappointed that it looked
    such a mess, but I don't see a better way. Anyway, the statement is

    s/([\x00-\1F])/'^'.chr(ord($1) + 0x40)/eg

    where the regex is

    ([\x00-\1F])

    The enclosing parentheses capture whatever the pattern inside them
    matches as $1, for use later in the replacement expression or even in
    a later statement. Within them is a character class [ .. ] which is
    simply all control characters. It's the first 'column' of the 7-bit
    128-character ASCII set with byte values 0 through 31 or 0x00 through
    0x1F. It would be better expressed as

    [[:cntrl:]]

    which is nearly identical (it also matches DEL, 0x7F) but describes
    what you /mean/ rather than how your machine should do it.

    OK, so we've captured one control character into $1. Then comes the
    replacement string, which can be an executable expression with the /e
    modifier on the substitution. Note that for simple interpolation of
    variables like the captured $1, $2 etc, and in fact any variable
    (including arrays and hashes) in scope, there is no need for /e. It is
    only necessary if there are operators or subroutines that need to
    be executed to build the replacement string.
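
    A minimal sketch of that distinction, using made-up sample strings
    and variable names: plain interpolation of $1 needs no /e, while a
    function call such as uc() in the replacement only runs with /e.

    ```perl
    use strict;
    use warnings;

    # Plain interpolation: $1 is expanded as in a double-quoted string.
    (my $quoted = "hello world") =~ s/(\w+)/<$1>/g;

    # With /e the replacement is Perl code whose result is substituted.
    (my $upper = "hello world") =~ s/(\w+)/uc($1)/eg;

    # Without /e the same text is only a string, so "uc(...)" stays literal.
    (my $literal = "hello world") =~ s/(\w+)/uc($1)/g;

    print "$quoted\n";    # <hello> <world>
    print "$upper\n";     # HELLO WORLD
    print "$literal\n";   # uc(hello) uc(world)
    ```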

    It's a mess because there is no way of relating control characters
    (e.g. CR) with their alphabetic equivalents (e.g. CTRL/M) without
    doing character arithmetic. And that's not what characters do in
    /real/ life.

    In

    '^'.chr(ord($1) + 0x40)

    ord($1) returns the byte value of the control character.

    + 0x40

    moves that byte value from the first column (control characters) to the
    third column (capital alphas)

    chr()

    turns that byte value back into a one-character ASCII string.

    '^'.

    prepends a caret to that character. Hence "\cM" becomes
    '^M'.

    All that is left is the /g modifier, which simply replaces
    all instances of the regex instead of just the first one found.
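
    The same steps can be traced outside the substitution. This is only
    an illustrative sketch for the single character "\cM"; the variable
    names are invented for the example.

    ```perl
    use strict;
    use warnings;

    my $ctrl    = "\cM";               # carriage return, byte value 13
    my $code    = ord($ctrl);          # 13
    my $shifted = $code + 0x40;        # 77, the byte value of 'M'
    my $visible = '^' . chr($shifted); # caret followed by 'M'

    printf "%d -> %d -> %s\n", $code, $shifted, $visible;   # 13 -> 77 -> ^M
    ```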

    I hope this helps. It's useful for me to tie down my
    programming to first principles once in a while and ask
    /why/ did I write that?

    Cheers guys.

    Rob


    Rob Dixon Guest

  10. #9

    Default Re: Pattern Match

    Rob Dixon writes:
    > Tom Kinzer wrote:
    > I didn't think it was slick at all. In fact I was disappointed that
    > it looked such a mess, but I don't see a better way.
    Yes, it is indeed a mess, not only syntactically, but also semantically.
    While it might make a good teaching example to show what you can do in
    a perl regex, it might not be a very good way to do what is ultimately
    accomplished.

    First, a regular expression pattern match is conducted to find all
    chars in the string that are in the desired "special processing"
    range. Note that these are each individual characters, not
    substrings, so the regex match is gross overkill from a computational
    complexity point of view.

    Second, all that is desired is to insert a circumflex and then the
    character plus a bias to make it printable.

    Now if this is all that has to be done, and you want to do it to a
    bunch of large files, then the way you show is a poor way to do it. A
    simple C program could be written to get a character from stdin, check
    it in an "if" statement to see if it is in the desired range, and then
    output the circumflex followed by the biased character to stdout if
    it is in the range, or else just output the character. This simple
    one-char-at-a-time streaming filter approach would be considerably
    simpler computationally than the method you provide.

    Now if you only need to do this to massage a few lines of output in a
    program with a much larger overall purpose, then perhaps your example
    is the way to go.

    My question is, how does perl's regex compiler handle the code you
    gave? Does it optimize it to a similar level of complexity as my C
    example, or does it smash it with a one-size-fits-all regular
    expression engine? I know regular expressions can be highly optimized
    at compile time, so this is an important question. If the regex is
    sufficiently optimized, then it would always be the way to go.

    Bob "Rj" Brown

    --
    -------- "And there came a writing to him from Elijah" [2Ch 21:12] --------
    R. J. Brown III [email]rj@elilabs.com[/email] [url]http://www.elilabs.com/~rj[/url] voice 847 543-4060
    Elijah Laboratories Inc. 457 Signal Lane, Grayslake IL 60030 fax 847 543-4061
    ----- M o d e l i n g t h e M e t h o d s o f t h e M i n d ------
    Robert Brown Guest

  11. #10

    Default Re: Pattern Match

    Robert Brown wrote:
    >
    > Rob Dixon writes:
    >
    > > Tom Kinzer wrote:
    >
    > > I didn't think it was slick at all. In fact I was disappointed that
    > > it looked such a mess, but I don't see a better way.
    >
    > Yes, it is indeed a mess, not only syntactically, but also semantically.
    > While it might make a good teaching example to show what you can do in
    > a perl regex, it might not be a very good way to do what is ultimately
    > accomplished.
    >
    > First, a regular expression pattern match is conducted to find all
    > chars in the string that are in the desired "special processing"
    > range. Note that these are each individual characters, not
    > substrings, so the regex match is gross overkill from a computational
    > complexity point of view.
    >
    > Second, all that is desired is to insert a circumflex and then the
    > character plus a bias to make it printable.
    >
    > Now if this is all that has to be done, and you want to do it to a
    > bunch of large files, then the way you show is a poor way to do it. A
    > simple C program could be written to get a character from stdin, check
    > it in an "if" statement to see if it is in the desired range, and then
    > output the circumflex followed by the biased character to stdout if
    > it is in the range, or else just output the character. This simple
    > one-char-at-a-time streaming filter approach would be considerably
    > simpler computationally than the method you provide.
    >
    > Now if you only need to do this to massage a few lines of output in a
    > program with a much larger overall purpose, then perhaps your example
    > is the way to go.
    >
    > My question is, how does perl's regex compiler handle the code you
    > gave? Does it optimize it to a similar level of complexity as my C
    > example, or does it smash it with a one-size-fits-all regular
    > expression engine? I know regular expressions can be highly optimized
    > at compile time, so this is an important question. If the regex is
    > sufficiently optimized, then it would always be the way to go.
    Thanks Robert, but I wonder if you expect us to take you seriously?
    In which case I'll happily reply.

    Rob



    Rob Dixon Guest

  12. #11

    Default Re: Pattern Match

    Rob Dixon writes:
    > Robert Brown wrote:
    > >
    > > Rob Dixon writes:
    > >
    > > > Tom Kinzer wrote:
    > >
    > > > I didn't think it was slick at all. In fact I was disappointed that
    > > > it looked such a mess, but I don't see a better way.
    > >
    > > Yes, it is indeed a mess, not only syntactically, but also semantically.
    > > While it might make a good teaching example to show what you can do in
    > > a perl regex, it might not be a very good way to do what is ultimately
    > > accomplished.
    [ some words deleted here ... ]
    > > My question is, how does perl's regex compiler handle the code you
    > > gave? Does it optimize it to a similar level of complexity as my C
    > > example, or does it smash it with a one-size-fits-all regular
    > > expression engine? I know regular expressions can be highly optimized
    > > at compile time, so this is an important question. If the regex is
    > > sufficiently optimized, then it would always be the way to go.
    >
    > Thanks Robert, but I wonder if you expect us to take you seriously?
    > In which case I'll happily reply.
    >
    > Rob
    Yes! Please take my request seriously. I hope you can show me that
    the regex approach you used pays no penalty other than perhaps a few
    extra milliseconds of compilation time, and that it executes very
    efficiently. That is what I want to see. I know it *CAN*
    (theoretically) be done; I am just wondering if it indeed has been
    done.

    Rj
    Robert Brown Guest

  13. #12

    Default Re: Pattern Match

    Robert Brown wrote:
    >
    > Rob Dixon writes:
    > > Robert Brown wrote:
    > > >
    > > > Rob Dixon writes:
    > > >
    > > > > Tom Kinzer wrote:
    > > >
    > > > > I didn't think it was slick at all. In fact I was disappointed that
    > > > > it looked such a mess, but I don't see a better way.
    > > >
    > > > Yes, it is indeed a mess, not only syntactically, but also semantically.
    > > > While it might make a good teaching example to show what you can do in
    > > > a perl regex, it might not be a very good way to do what is ultimately
    > > > accomplished.
    >
    > [ some words deleted here ... ]
    >
    > > > My question is, how does perl's regex compiler handle the code you
    > > > gave? Does it optimize it to a similar level of complexity as my C
    > > > example, or does it smash it with a one-size-fits-all regular
    > > > expression engine? I know regular expressions can be highly optimized
    > > > at compile time, so this is an important question. If the regex is
    > > > sufficiently optimized, then it would always be the way to go.
    > >
    > > Thanks Robert, but I wonder if you expect us to take you seriously?
    > > In which case I'll happily reply.
    > >
    > > Rob
    >
    > Yes! Please take my request seriously. I hope you can show me that
    > the regex approach you used pays no penalty other than perhaps a few
    > extra milliseconds of compilation time, and that it executes very
    > efficiently. That is what I want to see. I know it *CAN*
    > (theoretically) be done; I am just wondering if it indeed has been
    > done.
    Yes, I'd gladly trade in my Honda 750cc for a lightcycle: I know
    it *CAN* (theoretically) be done.

    I'm sure you have something useful to say. This seems such a waste of
    your effort.

    Rob


    Rob Dixon Guest

  14. #13

    Default Re: Pattern Match

    Eric Sand wrote:
    >
    > Hi All,
    Hello,
    > I am very new to Perl,
    Welcome. :-)
    > but I sense a great adventure ahead after just
    > programming with Cobol, Pascal, and C over the last umpteen years.
    "Thinking in Perl" may take a while but it is not your grandfather's
    programming language (sorry COBOL.)
    > I have
    > written a perl script where I am trying to detect a non-printing
    > character(Ctrl@ - Ctrl_)
    Your idea of non-printing seems to conflict with industry standards as
    CtrlG - CtrlM are all printable. Also you are using perl's standard
    readline and chomp()ing the input so you are not converting the CtrlJ
    character at all.
    > and then substitute a printing ASCII sequence such
    > as "^@" in its place, but it does not seem to work as I would like. Any
    > advice would be greatly appreciated.

    use warnings;
    use strict;
    > $in_ctr=0;
    > $out_ctr=0;
    Whitespace is free and makes your code more readable and maintainable.

    my $in_ctr = 0;
    my $out_ctr = 0;

    > while ($line = <STDIN>)
    > {
    > chomp($line);
    > $in_ctr ++;
    > if ($line = s/\c@,\cA,\cB,\cC,\cD,\cE,\cF,\cG,\cH,\cI,\cJ,\cK,
    > \cL,\cM,\cN,\cO,\cP,\cQ,\cR,\cS,\cT,\cU,\cV,\cW,
    > \cX,\cY,\cZ,\c[,\c\,\c],\c^,\c_
    > /^@,^A,^B,^C,^D,^E,^F,^G,^H,^I,^J,^K,
    > ^L,^N,^N,^O,^P,^Q,^R,^S,^T,^U,^V,^W,
    > ^X,^Y,^Z,^[,^\,^],^^,^_/)
    As Rob pointed out, this is not the correct way to use the substitution
    operator (see below.)
    > {
    > $out_ctr ++;
    > printf("Non-printing chars detected in: %s\n",$line);
    You shouldn't use printf unless you really have to and in this case you
    don't really have to.

    print "Non-printing chars detected in: $line\n";

    > }
    > }
    > printf("Total records read = %d\n",$in_ctr);
    > printf("Total records written with non-printing characters =
    > %d\n",$out_ctr);
    print "Total records read = $in_ctr\n";
    print "Total records written with non-printing characters = $out_ctr\n";


    I would probably write it like this:

    use warnings;
    use strict;

    my $out_ctr = 0;

    while ( <STDIN> ) {
        next unless s/([[:cntrl:]])/'^' . ( $1 | "\x40" )/eg;
        $out_ctr++;
        print "Non-printing chars detected in: $_\n";
    }
    print "Total records read = $.\n";
    print "Total records written with non-printing characters = $out_ctr\n";

    __END__
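
    For readers puzzled by $1 | "\x40": when both operands are strings,
    Perl's | operates on the bytes of the strings rather than on numbers.
    The sketch below (variable names are invented, and it assumes the
    'bitwise' feature is not enabled) compares it with the chr/ord
    arithmetic used earlier in the thread.

    ```perl
    use strict;
    use warnings;

    # String operands: | combines the bytes of the two strings.
    my $by_or  = "\cM" | "\x40";           # 0x0D | 0x40 = 0x4D
    my $by_add = chr(ord("\cM") + 0x40);   # 0x0D + 0x40 = 0x4D

    print "$by_or $by_add\n";   # M M

    # The two differ once bit 0x40 is already set, as for DEL (0x7F),
    # which [[:cntrl:]] also matches.
    printf "%02X %02X\n", ord("\x7F" | "\x40"), ord("\x7F") + 0x40;   # 7F BF
    ```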



    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn Guest

  15. #14

    Default Re: Pattern Match

    Rob Dixon wrote:
    >
    > I didn't think it was slick at all. In fact I was disappointed that it looked
    > such a mess, but I don't see a better way. Anyway, the statement is
    >
    > s/([\x00-\1F])/'^'.chr(ord($1) + 0x40)/eg
    >
    > where the regex is
    >
    > ([\x00-\1F])
    Oops. That matches the characters "\0", "\1" and 'F'. It should be
    ([\x00-\x1F])

    :-)
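
    A quick way to see the difference is to count what each class
    matches in a test string. This is a sketch with a made-up sample;
    inside a bracketed class \1 is read as the octal escape for byte 1,
    so the first class is the range 0..1 plus a literal 'F'.

    ```perl
    use strict;
    use warnings;

    my $sample = "F\x00\x01\x07\x1F";   # 'F', NUL, SOH, BEL, US

    # Count how many characters of $sample each class matches.
    my $typo  = () = $sample =~ /[\x00-\1F]/g;    # matches F, NUL, SOH
    my $fixed = () = $sample =~ /[\x00-\x1F]/g;   # matches the four controls

    print "typo class matched $typo, fixed class matched $fixed\n";
    ```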


    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn Guest

  16. #15

    Default Re: Pattern Match

    Rob Dixon writes:
    > I'm sure you have something useful to say. This seems such a waste of
    > your effort.
    >
    > Rob
    I think we are failing to communicate. What I am asking is:

    "Does the regular expression mechanism in perl optimize regular
    expressions such as the one you used earlier in this thread so that
    the execution overhead is nearly as good as the C approach I outlined
    earlier in this thread? In other words, for the problem stated
    earlier, does o(C) = o(perl)?

    Can I really use regular expressions as my main tool for scanning and
    modifying strings and expect to get speeds comparable to what I would
    get with hand tailored code? I hope so, because that would be
    wonderful.
    Robert Brown Guest

  17. #16

    Default Re: Pattern Match

    It was Tuesday, December 09, 2003 when Robert Brown took the soap box, saying:
    : Rob Dixon writes:
    : > I'm sure you have something useful to say. This seems such a waste of
    : > your effort.
    : >
    : > Rob
    :
    : I think we are failing to communicate. What I am asking is:

    Thanks for the clarification, this was getting a bit out of hand. :-)

    : "Does the regular expression mechanism in perl optimize regular
    : expressions such as the one you used earlier in this thread so that
    : the execution overhead is nearly as good as the C approach I outlined
    : earlier in this thread? In other words, for the problem stated
    : earlier, does o(C) = o(perl)?

    The answer is, C is almost always going to be much faster almost all the
    time, YMMV. Really the only way to tell is with tests and benchmarks,
    but you can almost always bet on C.

    : Can I really use regular expressions as my main tool for scanning and
    : modifying strings and expect to get speeds comparable to what I would
    : get with hand tailored code? I hope so, because that would be
    : wonderful.

    You could, however, make very good use of some builtin Perl functions
    like substr(), length(), pos(), index(), rindex(), study(), and so
    on. No regular expressions, but you don't always need them.
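
    As a rough illustration of the regex-free style, here is a sketch
    with an invented input line and key. It uses index(), length() and
    substr() from the list above; tr/// is an addition of this sketch,
    not something named above.

    ```perl
    use strict;
    use warnings;

    # Hypothetical input line and key.
    my $line = "name=perl;version=5";
    my $key  = 'version=';

    # Find a fixed substring with index() and cut with substr(); no regex.
    my $start = index($line, $key);
    my $value = $start >= 0 ? substr($line, $start + length($key)) : undef;
    print "version is $value\n" if defined $value;   # version is 5

    # tr/// with an empty replacement list counts characters without
    # changing the string; here it counts control characters.
    my $text     = "a\cMb\cJ";
    my $controls = ($text =~ tr/\x00-\x1F//);
    print "control characters: $controls\n";         # control characters: 2
    ```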


    Casey West

    --
    I'd rather listen to Newton than to Mundie. He may have been dead for
    almost three hundred years, but despite that he stinks up the room
    less.
    -- Linus Torvalds

    Casey West Guest

  18. #17

    Default Re: Pattern Match

    Casey West writes:
    > : "Does the regular expression mechanism in perl optimize regular
    > : expressions such as the one you used earlier in this thread so that
    > : the execution overhead is nearly as good as the C approach I outlined
    > : earlier in this thread? In other words, for the problem stated
    > : earlier, does o(C) = o(perl)?
    >
    > The answer is, C is almost always going to be much faster almost all the
    > time, YMMV. Really the only way to tell is with tests and benchmarks,
    > but you can almost always bet on C.
    Sorry again for my confusing way of expressing myself. Although I
    wrote my example in C, that was because I am a novice perl programmer,
    but an experienced C programmer, so I expressed my algorithm in C.

    The idea was to compare the execution efficiency of a perl regular
    expression approach to a less syntactically compact algorithmic approach
    using loops and conditionals, still written in perl, to edit the
    string. I just used C so you all would not beat me up over perl
    syntax details instead of answering the real question.

    Is perl going to be comparably efficient whichever way you code it, or
    is the explicit test and loop approach usually going to be faster for
    simple jobs? I want to know when to use the regex approach and when
    not to.

    Yes, I realize that much of the time it is not computer time one needs
    to optimize for, but programmer time, especially for "throw-away"
    hacks that only take milliseconds to run anyway.

    For the big jobs, one must be more careful. For example, right now I
    am working on a set of scripts to implement a production backup
    server. A live test run mucks around with about 100 GB of files over
    a network that I wish was faster, stuffing those files safely into a
    "negative time" image on a RAID5 array, and takes about 6 hours to
    run. It is a good idea to try to make such long jobs run quicker!

    Robert Brown Guest

  19. #18

    Default Re: Pattern Match

    Since we're now talking about performance issues, somebody should say something about precompiling the regular expression, when you can, with either /o or qr(). I had a process's running time go from 2min 45sec to just under 24sec simply by using qr on the relevant regular expressions.
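A minimal sketch of the qr// idea, assuming the patterns are built from strings known before the main loop starts. The word list and the count_matches helper are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical word list; in a real job this might come from a config file.
my @words = qw(beans pork kidney);

# Compile each pattern once, outside the hot loop. \Q...\E quotes any
# regex metacharacters in the word; /i makes the match case-insensitive.
my @compiled = map { qr/\b\Q$_\E\b/i } @words;

# Count how many (line, pattern) pairs match, reusing the compiled patterns.
sub count_matches {
    my @lines = @_;
    my $count = 0;
    for my $line (@lines) {
        for my $re (@compiled) {
            $count++ if $line =~ $re;
        }
    }
    return $count;
}

my $hits = count_matches("Pork & Beans", "Kidney beans", "plain rice");
print "hits: $hits\n";
```

How much this saves depends on the Perl version and on how the patterns are interpolated, so the timing quoted above should not be read as typical.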


    Robert Brown <eli@xnet.com> wrote:
    >Rob Dixon writes:
    > > I'm sure you have something useful to say. This seems such a waste of
    > > your effort.
    > >
    > > Rob
    >
    >I think we are failing to communicate. What I am asking is:
    >
    >"Does the regular expression mechanism in perl optimize regular
    >expressions such as the one you used earlier in this thread so that
    >the execution overhead is nearly as good as the C approach I outlined
    >earlier in this thread? In other words, for the problem stated
    >earlier, does o(C) = o(perl)?
    >
    >Can I really use regular expressions as my main tool for scanning and
    >modifying strings and expect to get speeds comparable to what I would
    >get with hand tailored code? I hope so, because that would be
    >wonderful.

    mcdavis941@netscape.net Guest

  20. #19

    Default Re: Pattern Match

    Before I finally burst my cyanide capsule, may I.. ?

    Rj wrote:
    >
    > Rob Dixon writes:
    > >
    > > I didn't think it was slick at all. In fact I was
    > > disappointed that it looked such a mess, but I don't see
    > > a better way.
    >
    > Yes, it is indeed a mess, not only syntactically, but also
    > semantically.
    What is a syntactic mess? And, even more obscurely, what is a
    semantic mess?
    > While it might make a good teaching example to show what you
    > can do in a perl regex, it might not be a very good way to do
    > what is ultimately accomplished.
    The stark realisation is that an infinite majority of problems
    have no solution at all. This is a Perl newsgroup.
    > First, a regular expression pattern match is conducted to find
    > all chars in the string that are in the desired "special
    > processing" range. Note that these are each individual
    > characters, not substrings, so the regex match is gross
    > overkill from a computational complexity point of view.
    >
    > Second, all that is desired is to insert a circumflex and then
    > the character plus a bias to make it printable.
    I've never before seen a software solution reverse-engineered as
    far as the documentation plus obfuscations! As far as possible a
    piece of software should be a description of what is to be done:
    that is what compilers/interpreters/assemblers/shell languages
    are for. Ideally what I should be able to write is:

    replace all control characters with their printable
    equivalents

    It is only the rigour of programming languages that prevents
    this. And why most companies still employ people.
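The code being argued over is not reproduced in this part of the thread. As a rough sketch of the task as described (a circumflex followed by the character plus a bias), assuming the bias is 64 and that DEL is shown as ^?, a substitution along these lines would do it. The name show_controls is hypothetical:

```perl
use strict;
use warnings;

# Assumed mapping: control characters 0x00-0x1F become ^@ .. ^_
# (the character code plus 64), and DEL (0x7F) becomes ^?.
sub show_controls {
    my ($text) = @_;
    # /e evaluates the replacement as code; /g applies it to every match.
    $text =~ s/([\x00-\x1f])/'^' . chr(ord($1) + 64)/ge;
    $text =~ s/\x7f/^?/g;
    return $text;
}

print show_controls("tab\there\x01\n"), "\n";
```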
    > Now if this is all that has to be done, and you want to do it
    > to a bunch of large files, then the way you show is a poor way
    > to do it.
    Now

    "Yes, it is indeed a mess, not only syntactically, but also
    semantically."

    and

    "the way you show is a poor way to do it"

    are downright rude. Especially without an alternative option.

    Do you want to be taken seriously or what?


    I wrote an algorithm. If you have a problem with how well
    (in whatever sense) a computer executes that algorithm then you
    have an issue with the originators of the language and its
    implementors. I for one think that Perl is one of the best-
    conceived languages and certainly the best choice for any stand-
    alone program.
    > A simple C program could be written to get a character from
    > stdin, check it in an "if" statement to see if it is in the
    > desired range, and then output the circumflex followed by the
    > biased character to stdout if it is in the range, or else just
    > output the character. This simple one-char-at-a-time streaming
    > filter approach would be considerably simpler computationally
    > than the method you provide.
    How are you so sure that that's not how my algorithm is
    implemented by the compiler?
    > Now if you only need to do this to massage a few lines of
    > output in a program with a much larger overall purpose, then
    > perhaps your example is the way to go.
    Or perhaps it's the best way to go anyway?
    > My question is, how does perl's regex compiler handle the code
    > you gave? Does it optimize it to a similar level of
    > complexity as my C example, or does it smash it with a one-
    > size-fits-all regular expression engine? I know regular
    > expressions can be highly optimized at compile time, so this
    > is an important question. If the regex is sufficiently
    > optimized, then it would always be the way to go.
    'Sufficiently'? Why do you need to know? I know very well that I
    can write something in Intel assembler that will perform far
    faster than your C program. But I don't need to. I still don't
    understand your point. Just how fast do you need this thing to
    go? Why not just put stripes on it?



    Rob

    BTW have you read the context of your sig?

    Rj wrote:
    >
    > -------- "And there came a writing to him from Elijah" [2Ch 21:12] --------
    > R. J. Brown III [email]rj@elilabs.com[/email] [url]http://www.elilabs.com/~rj[/url] voice 847 543-4060
    > Elijah Laboratories Inc. 457 Signal Lane, Grayslake IL 60030 fax 847 543-4061
    > ----- M o d e l i n g t h e M e t h o d s o f t h e M i n d ------
    2 Chronicles 21:12,13

    Jehoram received a letter from Elijah the prophet, which said:

    "This is what the LORD, the God of your father David, says: 'You
    have not walked in the ways of your father Jehoshaphat or of Asa
    king of Judah. But you have walked in the ways of the kings of
    Israel, and you have led Judah and the people of Jerusalem to
    prostitute themselves, just as the house of Ahab did. You have
    also murdered your own brothers, members of your father's house,
    men who were better than you.


    Rob Dixon Guest

  21. #20

    Default Re: Pattern Match

    From: Robert Brown <eli@xnet.com>
    > Casey West writes:
    > > : "Does the regular expression mechanism in perl optimize regular
    > > : expressions such as the one you used earlier in this thread so that
    > > : the execution overhead is nearly as good as the C approach I outlined
    > > : earlier in this thread? In other words, for the problem stated
    > > : earlier, does o(C) = o(perl)?
    > >
    > > The answer is, C is almost always going to be much faster almost all the
    > > time, YMMV. Really the only way to tell is with tests and benchmarks,
    > > but you can almost always bet on C.
    >
    > Sorry again for my confusing way of expressing myself. Although I
    > wrote my example in C, that was because I am a novice perl programmer,
    > but an experienced C programmer, so I expressed my algorithm in C.
    >
    > The idea was to compare the execution efficiency of a perl regular
    > expression approach to a less syntactically compact algorithmic approach
    > using loops and conditionals, still written in perl, to edit the
    > string. I just used C so you all would not beat me up over perl
    > syntax details instead of answering the real question.
    >
    > Is perl going to be comparably efficient whichever way you code it, or
    > is the explicit test and loop approach usually going to be faster for
    > simple jobs? I want to know when to use the regex approach and when
    > not to.
    1. Perl's builtins, and especially the regular expression engine, are
    heavily optimized. So it might very well be quicker to use a regexp
    from Perl than to implement the same stuff in C, unless you spend a
    lot of time tweaking the code.

    2. One regexp (assuming it's created well) will almost always be
    quicker than several loops and ifs in Perl.

    While you should not use a regexp where the "normal" functions
    suffice, you should not go to great lengths implementing something
    that would be simple as a regexp. It'll be harder to maintain and
    most probably slower.

    3. If you really need to know which solution is quicker
    use Benchmark;
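A minimal sketch of such a comparison, using cmpthese from the core Benchmark module. The task (turning control characters into ^X form), the sample data, and the sub names with_regex and with_loop are assumptions chosen for illustration, not code from this thread:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic input: 10,000 characters cycling through codes 0..127.
my $sample = join '', map { chr($_ % 128) } 0 .. 9999;

# Regex approach: one substitution over the whole string.
sub with_regex {
    my ($text) = @_;
    $text =~ s/([\x00-\x1f])/'^' . chr(ord($1) + 64)/ge;
    return $text;
}

# Explicit approach: walk the string one character at a time.
sub with_loop {
    my ($text) = @_;
    my $out = '';
    for my $i (0 .. length($text) - 1) {
        my $ch   = substr($text, $i, 1);
        my $code = ord($ch);
        $out .= $code < 32 ? '^' . chr($code + 64) : $ch;
    }
    return $out;
}

# Only compare speed once both versions agree on the result.
die "results differ" unless with_regex($sample) eq with_loop($sample);

# A negative count means "run each for at least that many CPU seconds".
cmpthese(-1, {
    regex => sub { with_regex($sample) },
    loop  => sub { with_loop($sample) },
});
```

The numbers will vary with the Perl version, the machine, and the shape of the input, so it is worth rerunning with data that resembles the real job.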


    Jenda
    ===== [email]Jenda@Krynicky.cz[/email] === [url]http://Jenda.Krynicky.cz[/url] =====
    When it comes to wine, women and song, wizards are allowed
    to get drunk and croon as much as they like.
    -- Terry Pratchett in Sourcery

    Jenda Krynicky Guest
