-
Michel #1
pattern match
Where can I find info or docs on "pattern matching" used within a WHERE clause
(MySQL)?
As I need to match against PHP variables, I'd prefer something adapted to PHP.
In "PHP and MySQL Web Development" (Luke Welling) I can't see much....
Michel Guest
-
Pham Nuwen #2
Re: pattern match
Michel wrote:
> Where can I find info or docs on "pattern matching" used within a WHERE clause
> (MySQL)?
> As I need to match against PHP variables, I'd prefer something adapted to PHP.
> In "PHP and MySQL Web Development" (Luke Welling) I can't see much....
I take it you want the DB to do the matching, not PHP.
You want something like:
select cans_of from canned_goods where lower(cans_of) LIKE lower('%beans%');
this will get Beans, beans, Pork & Beans, Kidney Beans, etc.....
Here is a good resource for SQL:
http://www.w3schools.com/sql/default.asp
--
/---+----+----+----+----+----+----++----+----+----+----+----+----+---\
I pham.nuwen3d6@libertydice.org   II No nation was ever ruined by    I
I http://www.libertydice.org      II trade, even seemingly the most  I
I remove "3d6" to e-mail          II disadvantageous. - Ben Franklin I
\---+----+----+----+----+----+----++----+----+----+----+----+----+---/
Pham Nuwen Guest
-
Jan Pieter Kunst #3
Re: pattern match
In article <biq71l$pcv$1@news-reader3.wanadoo.fr>,
"Michel" <MicheldeVathaire@wanadoo.fr> wrote:
> Where can I find info or docs on "pattern matching" used within a WHERE clause
> (MySQL)?
> As I need to match against PHP variables, I'd prefer something adapted to PHP.
> In "PHP and MySQL Web Development" (Luke Welling) I can't see much....
This describes MySQL's regular expression implementation:
<http://www.mysql.com/doc/en/Regexp.html>
JP
--
Sorry, <devnull@cauce.org> is a "spam trap".
E-mail address is <jpk"at"akamail.com>, where "at" = @.
Jan Pieter Kunst Guest
-
Eric Sand #4
Pattern Match
Hi All,
I am very new to Perl, but I sense a great adventure ahead after just
programming with Cobol, Pascal, and C over the last umpteen years. I have
written a perl script where I am trying to detect a non-printing
character(Ctrl@ - Ctrl_) and then substitute a printing ASCII sequence such
as "^@" in its place, but it does not seem to work as I would like. Any
advice would be greatly appreciated.
Thank You....Eric Sand
$in_ctr=0;
$out_ctr=0;
while ($line = <STDIN>)
{
    chomp($line);
    $in_ctr ++;
    if ($line = s/\c@,\cA,\cB,\cC,\cD,\cE,\cF,\cG,\cH,\cI,\cJ,\cK,
                  \cL,\cM,\cN,\cO,\cP,\cQ,\cR,\cS,\cT,\cU,\cV,\cW,
                  \cX,\cY,\cZ,\c[,\c\,\c],\c^,\c_
                 /^@,^A,^B,^C,^D,^E,^F,^G,^H,^I,^J,^K,
                  ^L,^N,^N,^O,^P,^Q,^R,^S,^T,^U,^V,^W,
                  ^X,^Y,^Z,^[,^\,^],^^,^_/)
    {
        $out_ctr ++;
        printf("Non-printing chars detected in: %s\n",$line);
    }
}
printf("Total records read = %d\n",$in_ctr);
printf("Total records written with non-printing characters = %d\n",$out_ctr);
Eric Sand Guest
-
Rob Dixon #5
Re: Pattern Match
Eric Sand wrote:
> I am very new to Perl, but I sense a great adventure ahead after just
> programming with Cobol, Pascal, and C over the last umpteen years. I have
> written a perl script where I am trying to detect a non-printing
> character(Ctrl@ - Ctrl_) and then substitute a printing ASCII sequence such
> as "^@" in its place, but it does not seem to work as I would like. Any
> advice would be greatly appreciated.
>
> Thank You....Eric Sand
Your obvious guess is to write Perl as if it were C. That's slightly better
than treating it as a scripting language, but there are many joys left to be
found!
> $in_ctr=0;
> $out_ctr=0;
>
> while ($line = <STDIN>)
> {
> chomp($line);
> $in_ctr ++;
> if ($line = s/\c@,\cA,\cB,\cC,\cD,\cE,\cF,\cG,\cH,\cI,\cJ,\cK,
> \cL,\cM,\cN,\cO,\cP,\cQ,\cR,\cS,\cT,\cU,\cV,\cW,
> \cX,\cY,\cZ,\c[,\c\,\c],\c^,\c_
> /^@,^A,^B,^C,^D,^E,^F,^G,^H,^I,^J,^K,
> ^L,^N,^N,^O,^P,^Q,^R,^S,^T,^U,^V,^W,
> ^X,^Y,^Z,^[,^\,^],^^,^_/)
> {
> $out_ctr ++;
> printf("Non-printing chars detected in: %s\n",$line);
> }
> }
> printf("Total records read = %d\n",$in_ctr);
> printf("Total records written with non-printing characters = %d\n",$out_ctr);
I would write this as below. The first thing is to *always*
use strict;
use warnings;
after which you have to declare all of your variables with 'my'.
The second is to get used to using the default $_ variable which
is set to the value for the current 'while(<>)' or 'for' loop
iteration, and is a default parameter for most built-in functions.
Finally, in your particular case you're using the s/// (substitute)
operator wrongly. The first part, s/here//, is a regular expression,
not a list of characters. You'll need to read up on these at
perldoc perlre
The second part, s//here/, is a string expression which can use
'captured' sequences (anything in brackets) from the first part
and, with the addition of the s///e (executable) qualifier can
also be an executable statement. Here I've used it to add 0x40
to the ASCII value of the control character grabbed by the regex.
A lot of this won't make sense until you learn some more, but I
hope you'll agree that this code is cuter than your original?
HTH,
Rob
use strict;
use warnings;
my $in_ctr = 0;
my $out_ctr = 0;
while (<>) {
    chomp;
    $in_ctr++;
    if (s/([\x00-\1F])/'^'.chr(ord($1) + 0x40)/eg) {
        $out_ctr++;
        printf "Non-printing chars detected in: %s\n", $_;
    }
}
printf "Total records read = %d\n", $in_ctr;
printf "Total records written with non-printing characters = %d\n", $out_ctr;
Rob Dixon Guest
-
Tom Kinzer #6
RE: Pattern Match
Rob, can you explain the details of that replace? That's pretty slick. I
see you're adding the hex value to get to the appropriate ASCII value, but
didn't know you could do some of that gyration inside a regex.
Thanks.
-Tom Kinzer
Tom Kinzer Guest
-
James Edward Gray II #7
Re: Pattern Match
On Dec 9, 2003, at 2:37 PM, Tom Kinzer wrote:
> Rob, can you explain the details of that replace? That's pretty slick. I
> see you're adding the hex value to get to the appropriate ASCII value, but
> didn't know you could do some of that gyration inside a regex.
The big secret there is the /e modifier at the end of that regex. That
allows the use of Perl code (to be evaled) as the replacement string.
You're right though, I thought it was slick too.
James
James Edward Gray II Guest
-
Rob Dixon #8
Re: Pattern Match
Tom Kinzer wrote:
> Rob, can you explain the details of that replace? That's pretty slick. I
> see you're adding the hex value to get to the appropriate ASCII value, but
> didn't know you could do some of that gyration inside a regex.
I didn't think it was slick at all. In fact I was disappointed that it looked
such a mess, but I don't see a better way. Anyway, the statement is
s/([\x00-\1F])/'^'.chr(ord($1) + 0x40)/eg
where the regex is
([\x00-\1F])
The enclosing parentheses capture whatever the regex matches as $1 for use later
in the replacement expression or even in a later statement. Within that
is a character class [ .. ] which is simply all control characters. It's
the first 'column' of the 7-bit 128-character ASCII set with byte values
0 through 31 or 0x00 through 0x1F. It would be better expressed as
[[:cntrl:]]
which is nearly identical (it also matches DEL, 0x7F) but describes what
you /mean/ rather than how your machine should do it.
OK, so we've captured one control character into $1. Then comes the
replacement string, which can be an executable expression with the /e
modifier on the substitution. Note that for simple interpolation of
variables like the captured $1, $2 etc, and in fact any variable
(including arrays and hashes) in scope, there is no need for /e. It is
only necessary if there are operators or subroutines that need to
be executed to build the replacement string.
It's a mess because there is no way of relating control characters
(e.g. CR) with their alphabetic equivalents (e.g. CTRL/M) without
doing character arithmetic. And that's not what characters do in
/real/ life.
In
'^'.chr(ord($1) + 0x40)
ord($1) returns the byte value of the control character.
+ 0x40
moves that byte value from the first column (control characters) to the
third column (capital alphas)
chr()
turns that byte value back into a one-character ASCII string.
'^'.
prepends a caret to that character. Hence "\cM" becomes
'^M'.
All that is left is the /g modifier, which simply replaces
all instances of the regex instead of just the first one found.
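The steps described above can be tried one at a time. What follows is a minimal sketch rather than the script from the post: it writes the upper bound of the character class as \x1F, and the helper name caret_encode is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of $text in which every control
# character (0x00-0x1F) is replaced by a caret plus a printable character.
sub caret_encode {
    my ($text) = @_;
    # /e evaluates the replacement as Perl code; /g repeats it for every match.
    $text =~ s/([\x00-\x1F])/'^' . chr(ord($1) + 0x40)/eg;
    return $text;
}

# The individual steps for a carriage return ("\cM", byte value 0x0D):
my $byte    = ord("\cM");      # byte value of the control character
my $shifted = $byte + 0x40;    # moved into the capital-letter column
my $letter  = chr($shifted);   # back to a one-character string
printf "%d + 64 = %d -> %s\n", $byte, $shifted, $letter;   # 13 + 64 = 77 -> M

print caret_encode("one\ttwo\r"), "\n";   # one^Itwo^M
```

Without /e the replacement would be treated as an ordinary interpolated string, so the text of the chr(...) expression would be inserted instead of its result.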
I hope this helps. It's useful for me to tie down my
programming to first principles once in a while and ask
/why/ did I write that?
Cheers guys.
Rob
Rob Dixon Guest
-
Robert Brown #9
Re: Pattern Match
Rob Dixon writes:
> Tom Kinzer wrote:
> I didn't think it was slick at all. In fact I was disappointed that
> it looked such a mess, but I don't see a better way.
Yes, it is indeed a mess, not only syntactically, but also semantically.
While it might make a good teaching example to show what you can do in
a perl regex, it might not be a very good way to do what is ultimately
accomplished.
First, a regular expression pattern match is conducted to find all
chars in the string that are in the desired "special processing"
range. Note that these are each individual characters, not
substrings, so the regex match is gross overkill from a computational
complexity point of view.
Second, all that is desired is to insert a circumflex and then the
character plus a bias to make it printable.
Now if this is all that has to be done, and you want to do it to a
bunch of large files, then the way you show is a poor way to do it. A
simple C program could be written to get a character from stdin, check
it in an "if" statement to see if it is in the desired range, and then
output the circumflex followed by the biased character to stdout if
it is in the range, or else just output the character. This simple
one-char-at-a-time streaming filter approach would be considerably
simpler computationally than the method you provide.
Now if you only need to do this to massage a few lines of output in a
program with a much larger overall purpose, then perhaps your example
is the way to go.
My question is, how does perl's regex compiler handle the code you
gave? Does it optimize it to a similar level of complexity as my C
example, or does it smash it with a one-size-fits-all regular
expression engine? I know regular expressions can be highly optimized
at compile time, so this is an important question. If the regex is
sufficiently optimized, then it would always be the way to go.
Bob "Rj" Brown
--
-------- "And there came a writing to him from Elijah" [2Ch 21:12] --------
R. J. Brown III rj@elilabs.com http://www.elilabs.com/~rj voice 847 543-4060
Elijah Laboratories Inc. 457 Signal Lane, Grayslake IL 60030 fax 847 543-4061
----- M o d e l i n g t h e M e t h o d s o f t h e M i n d ------
Robert Brown Guest
-
Rob Dixon #10
Re: Pattern Match
Robert Brown wrote:
>
> Rob Dixon writes:
>
> > Tom Kinzer wrote:
>
> > I didn't think it was slick at all. In fact I was disappointed that
> > it looked such a mess, but I don't see a better way.
>
> Yes, it is indeed a mess, not only syntactically, but also semantically.
> While it might make a good teaching example to show what you can do in
> a perl regex, it might not be a very good way to do what is ultimately
> accomplished.
>
> First, a regular expression pattern match is conducted to find all
> chars in the string that are in the desired "special processing"
> range. Note that these are each individual characters, not
> substrings, so the regex match is gross overkill from a computational
> complexity point of view.
>
> Second, all that is desired is to insert a circumflex and then the
> character plus a bias to make it printable.
>
> Now if this is all that has to be done, and you want to do it to a
> bunch of large files, then the way you show is a poor way to do it. A
> simple C program could be written to get a character from stdin, check
> it in an "if" statement to see if it is in the desired range, and then
> output the circumflex followed by the biased character to stdout if
> it is in the range, or else just output the character. This simple
> one-char-at-a-time streaming filter approach would be considerably
> simpler computationally than the method you provide.
>
> Now if you only need to do this to massage a few lines of output in a
> program with a much larger overall purpose, then perhaps your example
> is the way to go.
>
> My question is, how does perl's regex compiler handle the code you
> gave? Does it optimize it to a similar level of complexity as my C
> example, or does it smash it with a one-size-fits-all regular
> expression engine? I know regular expressions can be highly optimized
> at compile time, so this is an important question. If the regex is
> sufficiently optimized, then it would always be the way to go.
Thanks Robert, but I wonder if you expect us to take you seriously?
In which case I'll happily reply.
Rob
Rob Dixon Guest
-
Robert Brown #11
Re: Pattern Match
Rob Dixon writes:
> Robert Brown wrote:
> >
> > Rob Dixon writes:
> >
> > > Tom Kinzer wrote:
> >
> > > I didn't think it was slick at all. In fact I was disappointed that
> > > it looked such a mess, but I don't see a better way.
> >
> > Yes, it is indeed a mess, not only syntactically, but also semantically.
> > While it might make a good teaching example to show what you can do in
> > a perl regex, it might not be a very good way to do what is ultimately
> > accomplished.
[ some words deleted here ... ]
> > My question is, how does perl's regex compiler handle the code you
> > gave? Does it optimize it to a similar level of complexity as my C
> > example, or does it smash it with a one-size-fits-all regular
> > expression engine? I know regular expressions can be highly optimized
> > at compile time, so this is an important question. If the regex is
> > sufficiently optimized, then it would always be the way to go.
>
> Thanks Robert, but I wonder if you expect us to take you seriously?
> In which case I'll happily reply.
>
> Rob
Yes! Please take my request seriously. I hope you can show me that
the regex approach you used pays no penalty other than perhaps a few
extra milliseconds of compilation time, and that it executes very
efficiently. That is what I want to see. I know it *CAN*
(theoretically) be done; I am just wondering if it indeed has been
done.
Rj
Robert Brown Guest
-
Rob Dixon #12
Re: Pattern Match
Robert Brown wrote:
> Yes! Please take my request seriously. I hope you can show me that
> the regex approach you used pays no penalty other than perhaps a few
> extra milliseconds of compilation time, and that it executes very
> efficiently. That is what I want to see. I know it *CAN*
> (theoretically) be done; I am just wondering if it indeed has been
> done.
Yes, I'd gladly trade in my Honda 750cc for a lightcycle: I know
it *CAN* (theoretically) be done.
I'm sure you have something useful to say. This seems such a waste of
your effort.
Rob
Rob Dixon Guest
-
John W. Krahn #13
Re: Pattern Match
Eric Sand wrote:
>
> Hi All,
Hello,
> I am very new to Perl,
Welcome. :-)
> but I sense a great adventure ahead after just
> programming with Cobol, Pascal, and C over the last umpteen years.
"Thinking in Perl" may take a while but it is not your grandfather's
programming language (sorry COBOL.)
> I have
> written a perl script where I am trying to detect a non-printing
> character(Ctrl@ - Ctrl_)
Your idea of non-printing seems to conflict with industry standards as
CtrlG - CtrlM are all printable. Also you are using perl's standard
readline and chomp()ing the input so you are not converting the CtrlJ
character at all.
> and then substitute a printing ASCII sequence such
> as "^@" in its place, but it does not seem to work as I would like. Any
> advice would be greatly appreciated.
use warnings;
use strict;
> $in_ctr=0;
> $out_ctr=0;
Whitespace is free and makes your code more readable and maintainable.
my $in_ctr = 0;
my $out_ctr = 0;
> while ($line = <STDIN>)
> {
> chomp($line);
> $in_ctr ++;
> if ($line = s/\c@,\cA,\cB,\cC,\cD,\cE,\cF,\cG,\cH,\cI,\cJ,\cK,
> \cL,\cM,\cN,\cO,\cP,\cQ,\cR,\cS,\cT,\cU,\cV,\cW,
> \cX,\cY,\cZ,\c[,\c\,\c],\c^,\c_
> /^@,^A,^B,^C,^D,^E,^F,^G,^H,^I,^J,^K,
> ^L,^N,^N,^O,^P,^Q,^R,^S,^T,^U,^V,^W,
> ^X,^Y,^Z,^[,^\,^],^^,^_/)
As Rob pointed out, this is not the correct way to use the substitution
operator (see below.)
> {
> $out_ctr ++;
> printf("Non-printing chars detected in: %s\n",$line);
You shouldn't use printf unless you really have to and in this case you
don't really have to.
print "Non-printing chars detected in: $line\n";
> }
> }
> printf("Total records read = %d\n",$in_ctr);
print "Total records read = $in_ctr\n";
> printf("Total records written with non-printing characters =
> %d\n",$out_ctr);
print "Total records written with non-printing characters = $out_ctr\n";
I would probably write it like this:
use warnings;
use strict;
my $out_ctr = 0;
while ( <STDIN> ) {
    next unless s/([[:cntrl:]])/'^' . ( $1 | "\x40" )/eg;
    $out_ctr++;
    print "Non-printing chars detected in: $_\n";
}
print "Total records read = $.\n";
print "Total records written with non-printing characters = $out_ctr\n";
__END__
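A note on the replacement expression in that script: when both operands of | are strings, Perl combines them byte by byte, so no ord()/chr() round trip is needed. The sketch below isolates that behaviour. The helper names caret_or and caret_xor are made up for illustration, and the XOR variant is an assumption about the desired output for DEL (0x7F), which [[:cntrl:]] also matches.

```perl
use strict;
use warnings;

# String bitwise OR: "\x0D" | "\x40" yields "\x4D", the letter M.
sub caret_or {
    my ($text) = @_;
    $text =~ s/([[:cntrl:]])/'^' . ( $1 | "\x40" )/eg;
    return $text;
}

# Variant using string XOR. For 0x00-0x1F it gives the same result as OR,
# and it additionally maps DEL (0x7F) to '?' (0x3F).
sub caret_xor {
    my ($text) = @_;
    $text =~ s/([[:cntrl:]])/'^' . ( $1 ^ "\x40" )/eg;
    return $text;
}

print caret_or("a\rb"), "\n";       # a^Mb
print caret_xor("a\x7Fb"), "\n";    # a^?b
```

With OR, DEL already has the 0x40 bit set and so stays unprintable after the caret; whether that matters depends on the input being processed.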
John
--
use Perl;
program
fulfillment
John W. Krahn Guest
-
John W. Krahn #14
Re: Pattern Match
Rob Dixon wrote:
>
> I didn't think it was slick at all. In fact I was disappointed that it looked
> such a mess, but I don't see a better way. Anyway, the statement is
>
> s/([\x00-\1F])/'^'.chr(ord($1) + 0x40)/eg
>
> where the regex is
>
> ([\x00-\1F])
Oops. That matches the characters "\0", "\1" and 'F'. It should be
([\x00-\x1F])
:-)
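The difference between the two character classes is easy to check. This is a small sketch with made-up helper names: inside a bracketed class \1 is read as an octal escape, so the first class is the range \x00-\x01 followed by a literal F.

```perl
use strict;
use warnings;

# Class as originally posted: range \x00-\x01, plus the letter F.
sub matches_typo  { my ($char) = @_; return $char =~ /[\x00-\1F]/  ? 1 : 0; }

# Corrected class: all 32 control characters, \x00 through \x1F.
sub matches_fixed { my ($char) = @_; return $char =~ /[\x00-\x1F]/ ? 1 : 0; }

for my $char ("\x01", "\r", "F") {
    printf "0x%02X  typo=%d  fixed=%d\n",
        ord($char), matches_typo($char), matches_fixed($char);
}
```

A carriage return is missed by the first class and caught by the second, while the letter F behaves the other way round.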
John
--
use Perl;
program
fulfillment
John W. Krahn Guest
-
Robert Brown #15
Re: Pattern Match
Rob Dixon writes:
> I'm sure you have something useful to say. This seems such a waste of
> your effort.
>
> Rob
I think we are failing to communicate. What I am asking is:
"Does the regular expression mechanism in perl optimize regular
expressions such as the one you used earlier in this thread so that
the execution overhead is nearly as good as the C approach I outlined
earlier in this thread? In other words, for the problem stated
earlier, does o(C) = o(perl)?"
Can I really use regular expressions as my main tool for scanning and
modifying strings and expect to get speeds comparable to what I would
get with hand tailored code? I hope so, because that would be
wonderful.
Robert Brown Guest
-
Casey West #16
Re: Pattern Match
It was Tuesday, December 09, 2003 when Robert Brown took the soap box, saying:
: Rob Dixon writes:
: > I'm sure you have something useful to say. This seems such a waste of
: > your effort.
: >
: > Rob
:
: I think we are failing to communicate. What I am asking is:
Thanks for the clarification, this was getting a bit out of hand. :-)
: "Does the regular expression mechanism in perl optimize regular
: expressions such as the one you used earlier in this thread so that
: the execution overhead is nearly as good as the C approach I outlined
: earlier in this thread? In other words, for the problem stated
: earlier, does o(C) = o(perl)?
The answer is, C is almost always going to be much faster almost all the
time, YMMV. Really the only way to tell is with tests and benchmarks,
but you can almost always bet on C.
: Can I really use regular expressions as my main tool for scanning and
: modifying strings and expect to get speeds comparable to what I would
: get with hand tailored code? I hope so, because that would be
: wonderful.
You could, however, make very good use of some builtin Perl functions
like substr(), length(), pos(), index(), rindex(), study(), and so
on. No regular expressions, but you don't always need them.
Casey West
--
I'd rather listen to Newton than to Mundie. He may have been dead for
almost three hundred years, but despite that he stinks up the room
less.
-- Linus Torvalds
Casey West Guest
-
Robert Brown #17
Re: Pattern Match
Casey West writes:
> : "Does the regular expression mechanism in perl optimize regular
> : expressions such as the one you used earlier in this thread so that
> : the execution overhead is nearly as good as the C approach I outlined
> : earlier in this thread? In other words, for the problem stated
> : earlier, does o(C) = o(perl)?
>
> The answer is, C is almost always going to be much faster almost all the
> time, YMMV. Really the only way to tell is with tests and benchmarks,
> but you can almost always bet on C.
Sorry again for my confusing way of expressing myself. Although I
wrote my example in C, that was because I am a novice perl programmer,
but an experienced C programmer, so I expressed my algorithm in C.
The idea was to compare the execution efficiency of a perl regular
expression approach to a less syntactically compact algorithmic approach
using loops and conditionals, still written in perl, to edit the
string. I just used C so you all would not beat me up over perl
syntax details instead of answering the real question.
Is perl going to be comparably efficient whichever way you code it, or
is the explicit test and loop approach usually going to be faster for
simple jobs? I want to know when to use the regex approach and when
not to.
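One way to approach that question is to time both styles on representative input with the core Benchmark module. The following is only a sketch: the function names, the sample string and the iteration count are made up for illustration, and the numbers it prints depend on the perl version and the data, so none are quoted here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Regex version, using the substitution discussed earlier in the thread.
sub with_regex {
    my ($text) = @_;
    $text =~ s/([\x00-\x1F])/'^' . chr(ord($1) + 0x40)/eg;
    return $text;
}

# Explicit test-and-loop version: examine one character at a time.
sub with_loop {
    my ($text) = @_;
    my $out = '';
    for my $i (0 .. length($text) - 1) {
        my $char = substr($text, $i, 1);
        my $code = ord($char);
        $out .= $code < 0x20 ? '^' . chr($code + 0x40) : $char;
    }
    return $out;
}

# Made-up sample line containing a tab and a carriage return.
my $sample = "plain text\twith a tab\r" x 20;

# Run each version a fixed number of times and print a comparison table.
cmpthese(2_000, {
    regex => sub { with_regex($sample) },
    loop  => sub { with_loop($sample) },
});
```

Both functions return the same string for the same input, so the table compares only how long each takes.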
Yes, I realize that much of the time it is not computer time one needs
to optimize for, but programmer time, especially for "throw-away"
hacks that only take milliseconds to run anyway.
For the big jobs, one must be more careful. For example, right now I
am working on a set of scripts to implement a production backup
server. A live test run mucks around with about 100 GB of files over
a network that I wish was faster, stuffing those files safely into a
"negative time" image on a RAID5 array, and takes about 6 hours to
run. It is a good idea to try to make such long jobs run quicker!
Robert Brown Guest
-
mcdavis941@netscape.net #18
Re: Pattern Match
Since we're now talking about performance issues, somebody should say something about precompiling the regular expression, when you can, with either /o or qr(). I had a process's running time go from 2min 45sec to just under 24sec simply by using qr on the relevant regular expressions.
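For readers who have not used it, qr// compiles a pattern once into a reusable object, which matters most when the pattern is assembled from variables and then applied to many lines. A minimal sketch follows; the term list and the helper names are hypothetical. Whether it speeds up a particular program is something to measure.

```perl
use strict;
use warnings;

# Hypothetical helper: build one case-insensitive pattern from a list of
# literal search terms. quotemeta escapes regex metacharacters in each term.
sub build_pattern {
    my (@terms) = @_;
    my $alternation = join '|', map { quotemeta($_) } @terms;
    return qr/(?:$alternation)/i;    # compiled here, once
}

# Apply the precompiled pattern to every line and count the hits.
sub count_matching_lines {
    my ($re, @lines) = @_;
    return scalar grep { $_ =~ $re } @lines;
}

my $pattern = build_pattern('beans', 'pork', 'a.b');
my @lines   = ('Pork & Beans', 'kidney beans', 'rice', 'axb');
print count_matching_lines($pattern, @lines), "\n";   # 2
```

The compiled object can also be interpolated into a larger pattern, so the same piece can be reused in several places.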
Robert Brown <eli@xnet.com> wrote:
>Rob Dixon writes:
> > I'm sure you have something useful to say. This seems such a waste of
> > your effort.
> >
> > Rob
>
>I think we are failing to communicate. What I am asking is:
>
>"Does the regular expression mechanism in perl optimize regular
>expressions such as the one you used earlier in this thread so that
>the execution overhead is nearly as good as the C approach I outlined
>earlier in this thread? In other words, for the problem stated
>earlier, does o(C) = o(perl)?
>
>Can I really use regular expressions as my main tool for scanning and
>modifying strings and expect to get speeds comparable to what I would
>get with hand tailored code? I hope so, because that would be
>wonderful.
mcdavis941@netscape.net Guest
-
Rob Dixon #19
Re: Pattern Match
Before I finally burst my cyanide capsule, may I.. ?
Rj wrote:
>
> Rob Dixon writes:
> >
> > I didn't think it was slick at all. In fact I was
> > disappointed that it looked such a mess, but I don't see
> > a better way.
>
> Yes, it is indeed a mess, not only syntactically, but also
> semantically.
What is a syntactic mess? And, even more obscurely, what is a
semantic mess?
> While it might make a good teaching example to show what you
> can do in a perl regex, it might not be a very good way to do
> what is ultimately accomplished.
The stark realisation is that an infinite majority of problems
have no solution at all. This is a Perl newsgroup.
> First, a regular expression pattern match is conducted to find
> all chars in the string that are in the desired "special
> processing" range. Note that these are each individual
> characters, not substrings, so the regex match is gross
> overkill from a computational complexity point of view.
>
> Second, all that is desired is to insert a circumflex and then
> the character plus a bias to make it printable.
I've never before seen a software solution reverse-engineered as
far as the documentation plus obfuscations! As far as possible a
piece of software should be a description of what is to be done:
that is what compilers/interpreters/assemblers/shell languages
are for. Ideally what I should be able to write is:
replace all control characters with their printable
equivalents
It is only the rigour of programming languages that prevents
this. And why most companies still employ people.
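For readers joining the thread here, the one-line intent above could be sketched in Perl roughly as follows. This is not the code posted earlier in the thread; the helper name `caret_escape`, the range 0x00-0x1F, and the bias of 64 are assumptions made for illustration.

```perl
use strict;
use warnings;

# Assumption: "control characters" means bytes 0x00-0x1F and the
# "bias" is 64, so TAB (0x09) becomes "^I" and BEL (0x07) becomes "^G".
# caret_escape is a hypothetical name, not taken from the thread.
sub caret_escape {
    my ($text) = @_;
    # /e evaluates the replacement as code, /g applies it to every match.
    $text =~ s/([\x00-\x1f])/'^' . chr(ord($1) + 64)/ge;
    return $text;
}

print caret_escape("one\ttwo\x07"), "\n";   # prints one^Itwo^G
```

Note that under this assumption a newline is also rewritten (to "^J"), which may or may not be wanted for line-oriented input.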
> Now if this is all that has to be done, and you want to do it
> to a bunch of large files, then the way you show is a poor way
> to do it.
Now
"Yes, it is indeed a mess, not only syntactically, but also
semantically."
and
"the way you show is a poor way to do it"
is downright rude. Especially without an alternative option.
Do you want to be taken seriously or what?
I wrote an algorithm. If you have a problem with how well
(in whatever sense) a computer executes that algorithm then you
have an issue with the originators of the language and its
implementors. I for one think that Perl is one of the best-
conceived languages and certainly the best choice for any stand-
alone program.
> A simple C program could be written to get a character from
> stdin, check it in an "if" statement to see if it is in the
> desired range, and then output the circumflex followed by the
> biased character to stdout if it is in the range, or else just
> output the character. This simple one-char-at-a-time streaming
> filter approach would be considerably simpler computationally
> than the method you provide.
How are you so sure that that's not how my algorithm is
implemented by the compiler?
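The one-character-at-a-time approach described in the quote can also be written in Perl itself, which makes the two styles easier to compare. A minimal sketch, assuming the same range (0x00-0x1F) and bias (64) as the rest of the discussion; `caret_escape_loop` is a made-up name, and the function works on a string rather than on stdin to keep it short.

```perl
use strict;
use warnings;

# Explicit test-and-loop version: look at each character, and either
# emit "^" plus the biased character or pass it through unchanged.
# Assumption: control range is 0x00-0x1F and the bias is 64.
sub caret_escape_loop {
    my ($text) = @_;
    my $out = '';
    for my $char (split //, $text) {
        my $code = ord $char;
        if ($code < 32) {
            $out .= '^' . chr($code + 64);
        }
        else {
            $out .= $char;
        }
    }
    return $out;
}

print caret_escape_loop("one\ttwo\x07"), "\n";   # prints one^Itwo^G
```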
> Now if you only need to do this to massage a few lines of
> output in a program with a much larger overall purpose, then
> perhaps your example is the way to go.
Or perhaps it's the best way to go anyway?
> My question is, how does perl's regex compiler handle the code
> you gave? Does it optimize it to a similar level of
> complexity as my C example, or does it smash it with a one-
> size-fits-all regular expression engine? I know regular
> expressions can be highly optimized at compile time, so this
> is an important question. If the regex is sufficiently
> optimized, then it would always be the way to go.
'Sufficiently'? Why do you need to know? I know very well that I
can write something in Intel assembler that will perform far
faster than your C program. But I don't need to. I still don't
understand your point. Just how fast do you need this thing to
go? Why not just put stripes on it?
Rob
BTW have you read the context of your sig?
Rj wrote:
> -------- "And there came a writing to him from Elijah" [2Ch 21:12] --------
> R. J. Brown III [email]rj@elilabs.com[/email] [url]http://www.elilabs.com/~rj[/url] voice 847 543-4060
> Elijah Laboratories Inc. 457 Signal Lane, Grayslake IL 60030 fax 847 543-4061
> ----- M o d e l i n g t h e M e t h o d s o f t h e M i n d ------
2 Chronicles 21:12,13
Jehoram received a letter from Elijah the prophet, which said:
"This is what the LORD, the God of your father David, says: 'You
have not walked in the ways of your father Jehoshaphat or of Asa
king of Judah. But you have walked in the ways of the kings of
Israel, and you have led Judah and the people of Jerusalem to
prostitute themselves, just as the house of Ahab did. You have
also murdered your own brothers, members of your father's house,
men who were better than you.
Rob Dixon Guest
-
Jenda Krynicky #20
Re: Pattern Match
From: Robert Brown <eli@xnet.com>
> Casey West writes:
> > : "Does the regular expression mechanism in perl optimize regular
> > : expressions such as the one you used earlier in this thread so that
> > : the execution overhead is nearly as good as the C approach I outlined
> > : earlier in this thread? In other words, for the problem stated
> > : earlier, does o(C) = o(perl)?
> >
> > The answer is, C is almost always going to be much faster almost all the
> > time, YMMV. Really the only way to tell is with tests and benchmarks,
> > but you can almost always bet on C.
>
> Sorry again for my confusing way of expressing myself. Although I
> wrote my example in C, that was because I am a novice perl programmer,
> but an experienced C programmer, so I expressed my algorithm in C.
>
> The idea was to compare the execution efficiency of a perl regular
> expression approach to a less syntactically compact algorithmic approach
> using loops and conditionals, still written in perl, to edit the
> string. I just used C so you all would not beat me up over perl
> syntax details instead of answering the real question.
>
> Is perl going to be comparably efficient whichever way you code it, or
> is the explicit test and loop approach usually going to be faster for
> simple jobs? I want to know when to use the regex approach and when
> not to.
1. Perl builtins and especially the regular expression engine are
heavily optimized. So it might very well be quicker to use a regexp
from Perl than to implement the same stuff in C. Unless you spend a
lot of time tweaking the code.
2. One regexp (assuming it's created well) will almost always be
quicker than several loops and ifs in Perl.
While you should not use a regexp where the "normal" functions
suffice, you should not go to great lengths implementing something
that would be simple as a regexp. It'll be harder to maintain and
most probably slower.
3. If you really need to know which solution is quicker
use Benchmark;
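A minimal sketch of what such a measurement might look like with the core Benchmark module, comparing a single substitution against an explicit loop. The sub names, the sample data, and the control range (0x00-0x1F, bias 64) are assumptions for illustration, and the numbers will depend on your perl version and input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test input: 10,000 characters cycling through 0x00-0x7F.
my $sample = join '', map { chr($_ % 128) } 0 .. 9999;

# One regexp doing the whole job.
sub with_regex {
    my ($text) = @_;
    $text =~ s/([\x00-\x1f])/'^' . chr(ord($1) + 64)/ge;
    return $text;
}

# The same job with loops and ifs, still in Perl.
sub with_loop {
    my ($text) = @_;
    my $out = '';
    for my $char (split //, $text) {
        my $code = ord $char;
        $out .= $code < 32 ? '^' . chr($code + 64) : $char;
    }
    return $out;
}

# A negative count means "run each for at least that many CPU seconds".
cmpthese(-1, {
    regex => sub { with_regex($sample) },
    loop  => sub { with_loop($sample) },
});
```

cmpthese prints a table of rates and relative percentages, so the faster variant on your machine can be read off directly instead of guessed.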
Jenda
===== [email]Jenda@Krynicky.cz[/email] === [url]http://Jenda.Krynicky.cz[/url] =====
When it comes to wine, women and song, wizards are allowed
to get drunk and croon as much as they like.
-- Terry Pratchett in Sourcery
Jenda Krynicky Guest