Webmaster@Oldwest.Org #1
quick re help
Hi everyone
I am pretty new to regex's, so I was happy when my text wrapping
expression worked - for the most part.
It messes up when I need to wrap lines with \n that don't end in a space.
If there is no space, it places last word on its own line before it
should wrap. Otherwise it double \ns the line.
I can't figure out why.
can anyone help me out?
sub quickWrap {
my $data = @_[0];
my $wrapAt = 75;
if (scalar @_ > 1) {
$wrapAt = @_[1];
}
my $wrappedData ="";
while ($data =~ /[^|\n][^\n]{$wrapAt,}?[ |$|\n]/) {
$data =~ /([^|\n])([^\n]{1,$wrapAt})( )([\s|\S]*$)/;
$wrappedData .= "$`$1$2\n";
$data = "$4";
}
return "$wrappedData$data";
}
thanks!
Webmaster@Oldwest.Org Guest
-
Jeff 'Japhy' Pinyan #2
Re: quick re help
On Aug 13, webmaster@oldwest.org said:
>It messes up when I need to wrap lines with \n that don't end in a space.
>If there is no space, it places last word on its own line before it
>should wrap. Otherwise it double \ns the line.

>sub quickWrap {
> my $data = @_[0];

You shouldn't use an array slice where you mean to use a single array
element.

my $data = $_[0];

> my $wrapAt = 75;
> if (scalar @_ > 1) {
> $wrapAt = @_[1];
> }

Again, @_[1] should be $_[1]. And the use of scalar() here is redundant.

if (@_ > 1) { $wrapAt = $_[1] }

We could have written those first few lines in many different ways. Here
are two ways I might have written it:
my $data = shift;
my $wrap_at = @_ ? shift : 75;
or
my $data = $_[0];
my $wrap_at = @_ > 1 ? $_[1] : 75;
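
A minimal sketch of the second form, wrapped in a hypothetical helper named unpack_args (not part of the original code) so the default can be checked:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the data and the wrap width it would use.
sub unpack_args {
    my $data    = $_[0];                  # a single element, not the slice @_[0]
    my $wrap_at = @_ > 1 ? $_[1] : 75;    # @_ in numeric context is its length
    return ($data, $wrap_at);
}

my ($d1, $w1) = unpack_args("some text");
my ($d2, $w2) = unpack_args("some text", 40);
print "$w1 $w2\n";    # prints "75 40"
```
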
> my $wrappedData ="";

It's not technically required to give $wrappedData a value of "", since
you're adding to it using the .= operator, which is nice enough not to
complain if the variable started out undef. Just a little note.

> while ($data =~ /[^|\n][^\n]{$wrapAt,}?[ |$|\n]/) {

Something tells me you're not sure what a character class does. A
character class is for CHARACTERS. Therefore, you don't use | in it. The
class [a|b|c] is the same as [|abc] -- that is, it matches an 'a', a 'b',
a 'c', or a '|'. Also, you can't match "beginning of line" or "end of
line" inside a character class, like you think you're doing with [^|\n]
and [ |$|\n]. First of all, the ^ and $ in a regex don't mean the same
thing inside a character class. Second, ^ and $ don't match characters,
they match locations. Third, the ^, as the first character of a class,
means "match everything except ...".
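
The points above can be checked with a short script. This is only an illustrative sketch; the test strings are made up for the example:

```perl
use strict;
use warnings;

# Inside a class, | is an ordinary character, so [a|b|c] also accepts '|'.
my @hits = grep { /\A[a|b|c]\z/ } ('a', 'b', 'c', '|', 'd');
print "@hits\n";    # prints "a b c |"

# A leading ^ negates the class: [^|\n] is "one character that is neither
# a pipe nor a newline", not "start of line or newline".
my @negated = grep { /\A[^|\n]\z/ } ('x', '|', "\n");
print scalar(@negated), "\n";    # prints "1" (only 'x' matches)

# The grouping that was probably intended: start of string OR a newline,
# followed here by a word character.
my $count = () = "one\ntwo\nthree" =~ /(?:^|\n)\w/g;
print "$count\n";    # prints "3"
```
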
So. Let's give your regex a fixer-upper. Instead of [^|\n], I have a
feeling you'll want either (?:^|\n) which matches ^ or \n, and doesn't
capture to any $DIGIT variable; or maybe you can use ^ with the /m
modifier on the regex. Instead of [^\n], you can just use . -- that's
what it was made for. And instead of [ |$|\n], you'll want (?:\s|$) I
think.
But I think you're doing MUCH more work than needed. We'll see.
> $data =~ /([^|\n])([^\n]{1,$wrapAt})( )([\s|\S]*$)/;

This regex looks familiar. I'm going to suggest a big change in a bit.
Oh, and [\s|\S], which could be [\s\S], is kind of awkward.

> $wrappedData .= "$`$1$2\n";

EWW. DON'T USE $`. It's terrible.

> $data = "$4";

You don't need those quotes.

> }
> return "$wrappedData$data";
>}

Ok, here's my idea: instead of matching text and putting it in a new
string, why not CHANGE the string we're working on as we match it? We can
do that using a substitution, the s/// operation.
We want to match UP TO $wrap_at characters, as many as possible, and add a
newline after them, SO LONG as it's in the place of a space. Here's a
regex I think will do the job for you:
sub quick_wrap { # I use the word_word_word style, not wordWordWord
    my $str = shift;
    my $wrap_at = @_ ? shift : 60;
    $str =~ s{(.{1,$wrap_at})\s}{$1\n}g;
    return $str;
}
The regex matches between 1 and $wrap_at characters (trying to match the
most possible) that are followed by a space. It replaces this with the
text it matched (and captured to $1) followed by a newline. Let me know
if this does what you expected.
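
For anyone trying this out, here is a small sketch of how the substitution behaves. Note that the last chunk is only wrapped where a whitespace character follows it, so a short final line can still be split at its last space. The variant quick_wrap_eos is my own assumption, not something from the thread: it also accepts end-of-string as a break point, at the cost of always appending a trailing newline.

```perl
use strict;
use warnings;

# The sub as posted above.
sub quick_wrap {
    my $str     = shift;
    my $wrap_at = @_ ? shift : 60;
    $str =~ s{(.{1,$wrap_at})\s}{$1\n}g;
    return $str;
}

# Hypothetical variant: a chunk may also end at the end of the string.
sub quick_wrap_eos {
    my $str     = shift;
    my $wrap_at = @_ ? shift : 60;
    $str =~ s{(.{1,$wrap_at})(?:\s|\z)}{$1\n}g;
    return $str;
}

my $wrapped = quick_wrap("the quick brown fox jumps over the lazy dog", 10);
print "$wrapped\n";
# the quick
# brown fox
# jumps over
# the lazy
# dog

my $short     = quick_wrap("hello world");        # "hello\nworld": split although it fits
my $short_eos = quick_wrap_eos("hello world");    # "hello world\n": kept on one line
```
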
--
Jeff "japhy" Pinyan japhy@pobox.com http://www.pobox.com/~japhy/
RPI Acacia brother #734 http://www.perlmonks.org/ http://www.cpan.org/
<stu> what does y/// stand for? <tenderpuss> why, yansliterate of course.
[ I'm looking for programming work. If you like my work, let me know. ]
Jeff 'Japhy' Pinyan Guest
-
Webmaster@Oldwest.Org #3
Re: quick re help
Jeff 'Japhy' Pinyan wrote:
> On Aug 13, webmaster@oldwest.org said:
>
>>sub quickWrap {
>> my $data = @_[0];
>
> You shouldn't use an array slice where you mean to use a single array
> element.
>

Thanks for catching that, I should have really seen that one.

$times_to_reread_my_code_before_posting_to_list++;

> my $data = shift;
> my $wrap_at = @_ ? shift : 75;
>

I like that.

> Something tells me you're not sure what a character class does. A
> character class is for CHARACTERS. Therefore, you don't use | in it. The

Thanks for the correction on character classes *runs to fix about half a
dozen regexes*

> This regex looks familiar. I'm going to suggest a big change in a
> bit.
> Oh, and [\s|\S], which could be [\s\S], is kind of awkward.

What is less awkward than [\s|\S] for 'match anything?'

> EWW. DON'T USE $`. It's terrible.
>

Okay, is that because it is slow and makes the rest of the regular
expressions afterwards run slowly? (I saw something about that in the
perlre document)
Is it as bad to use something like '(match anything)' before the main
expression, and using $1 in place of $` when it's useful?

> Ok, here's my idea: instead of matching text and putting it in a new
> string, why not CHANGE the string we're working on as we match it? We can
> do that using a substitution, the s/// operation.
>

That is a better approach. I had given up on that when I couldn't
understand why it was failing, and thought I could follow the logic
better if I broke it down into a match in a loop.

> We want to match UP TO $wrap_at characters, as many as possible, and add a
> newline after them, SO LONG as it's in the place of a space. Here's a
> regex I think will do the job for you:
>
> sub quick_wrap { # I use the word_word_word style, not wordWordWord
> my $str = shift;
> my $wrap_at = @_ ? shift : 60;
>
> $str =~ s{(.{1,$wrap_at})\s}{$1\n}g;
>
> return $str;
> }
>
> The regex matches between 1 and $wrap_at characters (trying to match the
> most possible) that are followed by a space. It replaces this with the
> text it matched (and captured to $1) followed by a newline. Let me know
> if this does what you expected.
>

That is exactly what I was trying to do, and that's a far, far more
elegant way to do it.

I think I'll review all my material again on regexes. Are there any
good books you recommend on how to use and think in regexes?
Thanks again for your help, that has really really helped.
Webmaster@Oldwest.Org Guest
-
James Edward Gray II #4
Re: quick re help
On Wednesday, August 13, 2003, at 04:22 PM, webmaster@oldwest.org
wrote:
> Jeff 'Japhy' Pinyan wrote:
>
>> On Aug 13, webmaster@oldwest.org said:
>> my $wrap_at = @_ ? shift : 75;
> I like that.

Just to add my two cents, I like:

my $wrap_at = shift || 75;

James Gray
James Edward Gray II Guest
-
Robert J Taylor #5
RE: quick re help
webmaster@oldwest.org inquired:
>> This regex looks familiar. I'm going to suggest a big change in a
>> bit.
>> Oh, and [\s|\S], which could be [\s\S], is kind of awkward.
> what is less awkward than [\s|\S] for 'match anything?'

.

Yes ->.<-

Dot, period, point, et al, is the universal match "something" symbol. So,
m'.*$' matches everything on a line. If you want to match a period you can
either escape it: \. or bracket it [.]. Escaping is better for simple
matching.
HTH,
Robert Taylor
Robert J Taylor Guest
-
Alan Perry #6
RE: quick re help
Robert J Taylor wrote:
>
>webmaster@oldwest.org inquired:
>
> >> This regex looks familiar. I'm going to suggest a big change in a
> >> bit.
> >> Oh, and [\s|\S], which could be [\s\S], is kind of awkward.
> > what is less awkward than [\s|\S] for 'match anything?'
>.
>
>Yes ->.<-
>
>Dot, period, point, et al, is the universal match "something" symbol. So,
>m'.*$' matches everything on a line. If you want to match a period you can
>either escape it: \. or bracket it [.]. Escaping is better for simple
>matching.

Almost... From "perldoc perlretut":

\s is a whitespace character and represents [\ \t\r\n\f]
\S is a negated \s; it represents any non-whitespace character [^\s]
The period '.' matches any character but "\n"
So, /[\s\S]/ would match a "\n", while /./ would not. The equivalent of
/[\s\S]/, using period notation, would be /[.\n]/
Alan
Alan Perry Guest
-
Jeff 'Japhy' Pinyan #7
Re: quick re help
On Aug 13, webmaster@oldwest.org said:

>Jeff 'Japhy' Pinyan wrote:
>
>> On Aug 13, webmaster@oldwest.org said:
>>
>> my $data = shift;
>> my $wrap_at = @_ ? shift : 75;
>>
>I like that.

The simpler-looking

my $wrap_at = shift || 75;

has also been proposed. The only reason I didn't use that is that, if you
were ever in a situation where 0 was a valid value for your variable, the
|| operator would throw the 0 away and use the default instead.
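
A small sketch of the difference, using hypothetical helper subs (the names are made up for the example). The third form uses the defined-or operator //, which assumes Perl 5.10 or later and was not available when this thread was written:

```perl
use strict;
use warnings;

sub width_or    { my $w = shift || 75;        return $w }   # default replaces any false value
sub width_count { my $w = @_ ? shift : 75;    return $w }   # default only when no argument is passed
sub width_dor   { my $w = shift(@_) // 75;    return $w }   # default only when the argument is undef

printf "%s %s %s\n", width_or(0), width_count(0), width_dor(0);   # prints "75 0 0"
printf "%s %s %s\n", width_or(),  width_count(),  width_dor();    # prints "75 75 75"
```
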

>what is less awkward than [\s|\S] for 'match anything?'

Well, for certain values of "less awkward", /(?s:.)/, which is the .
metacharacter with the /s switch turned on -- that way it matches any
character including newlines. I think someone suggested [.\n], but that
is not correct, because . just means "." inside a character class.
Another way to do it, if you are certain your input will have no
multi-byte characters, is to use \C, which matches a single byte.
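
To compare the candidates side by side, here is a rough sketch that tests each pattern against the letter "a", a newline, and a literal dot. The helper hits() is made up for the example. \C is left out because, as far as I know, it was deprecated in Perl 5.20 and removed in 5.24.

```perl
use strict;
use warnings;

# Returns a string of 1/0 flags: does the pattern match "a", "\n", "." ?
sub hits {
    my ($re) = @_;
    return join '', map { $_ =~ $re ? 1 : 0 } "a", "\n", ".";
}

print hits(qr/\A.\z/),      "\n";   # 101  plain dot skips the newline
print hits(qr/\A[\s\S]\z/), "\n";   # 111  matches anything
print hits(qr/\A(?s:.)\z/), "\n";   # 111  dot with /s turned on
print hits(qr/\A[.\n]\z/),  "\n";   # 011  only a literal dot or a newline
```
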

>> EWW. DON'T USE $`. It's terrible.
>Okay, is that because it is slow and makes the rest of the regular
>expressions afterwards run slowly? (I saw something about that in the
>perlre document)

Yes. When Perl compiles your program, it makes note of any use of $`, $&,
or $'. If it sees you use one of them ONCE, ANYWHERE, it will make every
regex prepare their values for you. Not cool. What is cool is that, even
though $1, $2, etc. have a similar cost, they are provided only on a
per-regex basis.

>Is it as bad to use something like '(match anything)' before the main
>expression, and using $1 in place of $` when it's useful?

Well, if you're using a recent enough Perl (5.6+), you have access to the
@- and @+ arrays, which hold offsets related to your last successful
pattern match. $-[0] holds the offset in your string where the match
started, so you could do

my $pre = substr($str, 0, $-[0]);

to get the equivalent of $`. See perlvar.

>I think I'll review all my material again on regexes. Are there any
>good books you recommend on how to use and think in regexes?

"Mastering Regular Expressions", published by O'Reilly.
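
Going back to the @- and @+ offsets described above, here is a minimal sketch of rebuilding all three pieces with substr. The sample string is made up for the example; $+[0] is the offset just past the end of the match.

```perl
use strict;
use warnings;

my $str = "wrap this text please";
my ($pre, $match, $post) = ('', '', '');

if ($str =~ /text/) {
    $pre   = substr($str, 0, $-[0]);                  # before the match, like $`
    $match = substr($str, $-[0], $+[0] - $-[0]);      # the match itself, like $&
    $post  = substr($str, $+[0]);                     # after the match, like $'
}

print "<$pre|$match|$post>\n";    # prints "<wrap this |text| please>"
```
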
Jeff 'Japhy' Pinyan Guest
-
Jeff 'Japhy' Pinyan #8
RE: quick re help
On Aug 14, Perry, Alan said:

>So, /[\s\S]/ would match a "\n", while /./ would not. The equivalent of
>/[\s\S]/, using period notation, would be /[.\n]/

Not so much; the . in a character class matches just itself.
Jeff 'Japhy' Pinyan Guest
-
Alan Perry #9
RE: quick re help
Jeff 'japhy' Pinyan wrote:
>
>On Aug 14, Perry, Alan said:
>
>>So, /[\s\S]/ would match a "\n", while /./ would not. The equivalent of
>>/[\s\S]/, using period notation, would be /[.\n]/
>Not so much; the . in a character class matches just itself.

You are correct, I forgot about that. You could use /.|\n/ ... That should
work.
Alan Perry Guest
-
Robert J Taylor #10
RE: quick re help
> > > what is less awkward than [\s|\S] for 'match anything?'
> >
> >.
> >
> >Yes ->.<-
> >
> >Dot, period, point, et al, is the universal match "something" symbol. So,
> >m'.*$' matches everything on a line. If you want to match a period you can
> >either escape it: \. or bracket it [.]. Escaping is better for simple
> >matching.
> Almost... From "perldoc perlretut":
>
> \s is a whitespace character and represents [\ \t\r\n\f]
> \S is a negated \s; it represents any non-whitespace character [^\s]
> The period '.' matches any character but "\n"
>
> So, /[\s\S]/ would match a "\n", while /./ would not. The equivalent of
> /[\s\S]/, using period notation, would be /[.\n]/
>
> Alan

Thanks! I appreciate the clarification.

Robert Taylor
Robert J Taylor Guest




