Öznur Taştan #1
all matches of a regex
Hi,
I have been trying to solve a problem which is about to drive me crazy.
Maybe someone knows the answer (hopefully :)
I want to get all matches of a pattern in a string, including the overlapping ones.
For example
the string is "xHxxHyyKzDt"
and the pattern is /^(.*)H(.*)K(.*)D(.*)$/
so in one round of matching $1=x $2=xxHyy $3=z $4=t
in another $1=xHxx $2=yy $3=z $4=t
while ($sequence=~/$pattern/g )
doesn't work, I think because the matches are overlapping
while ($sequence=~/(?=$pattern)/g )
also doesn't work
I would really appreciate it if someone could help.
Thanks.
oznur
-
Jan Eden #2
Re: all matches of a regex
Öznur Taştan wrote:
>Hi,
>I have been trying to solve a problem which is about to drive me crazy.
>Maybe someone knows the answer (hopefully :)
>
>I want to get all matches of a pattern in a string, including the overlapping
>ones.
>For example
>the string is "xHxxHyyKzDt"
>and the pattern is /^(.*)H(.*)K(.*)D(.*)$/
>
>so in one round of matching $1=x $2=xxHyy $3=z $4=t
>in another $1=xHxx $2=yy $3=z $4=t
>
I am not sure what concept you are referring to by "round", but you will never get the first result in any "round": your quantifiers are greedy, so the first pair of parentheses will always try to match as many characters as possible, as long as they are followed by H. So $1 will always be "xHxx", given your example string, $2 will be "yy", etc.
Your pattern assumes three capital letters in a string as a kind of delimiter and will not be able to use more than one "H" as the first delimiter.
If the only difference is more than one "H" delimiter in all of your strings, you could try running two different patterns, the second one using a non-greedy quantifier for your first grouping parentheses:
/^(.*?)H(.*)K(.*)D(.*)$/
Could you explain in a little more detail what you are trying to achieve? Maybe there's another way to do it.
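To make the greedy versus non-greedy difference concrete, here is a minimal sketch that runs both patterns against the example string. The variable names @greedy and @lazy are only illustrative; they do not come from the original question.

```perl
use strict;
use warnings;

my $sequence = 'xHxxHyyKzDt';

# Greedy first group: takes as much as it can, so it ends at the last usable H.
my @greedy = $sequence =~ /^(.*)H(.*)K(.*)D(.*)$/;

# Non-greedy first group: takes as little as it can, so it ends at the first H.
my @lazy = $sequence =~ /^(.*?)H(.*)K(.*)D(.*)$/;

print "greedy: @greedy\n";
print "lazy:   @lazy\n";
```

With only two H characters in the string these two runs cover every split, but a string with three or more H characters would still have splits that neither pattern reports.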
HTH,
Jan
--
There are 10 kinds of people: those who understand binary, and those who don't
-
Öznur tastan #3
Re: all matches of a regex
I didn't expect a reply so soon :)
Indeed, I know the greedy and non-greedy rules, which are the source of
the problem :/
I need a more general solution.
What I am trying to do is to extract all sets of combinations
when a string matches some patterns. In the example I gave, the first pattern is
H, the second is K and the third is D. I put them in the single pattern
/^(.*?)H(.*)K(.*)D(.*)$/ to be able to retrieve the substrings in between.
More generally, they can be looser patterns,
for example [HED] first
and [KL]{3} second.
I want to divide the sequence into substrings at the sites where the
patterns match, in all combinations, and retrieve the substrings between the
matched sites of the patterns.
I hope that is clearer.
Thanks
oznur
----- Original Message -----
From: "Jan Eden" <lists@jan-eden.de>
To: "Öznur Tastan" <oznurtastan@su.sabanciuniv.edu>; "Perl Lists"
<beginners@perl.org>
Sent: Sunday, February 15, 2004 2:19 PM
Subject: Re: all matches of a regex
-
Rob Dixon #4
Re: all matches of a regex
Öznur tastan wrote:
>
> Hi,
> I have been trying to solve a problem which is about to drive me crazy.
> Maybe someone knows the answer (hopefully :)
>
> I want to get all matches of a pattern in a string, including the overlapping ones.
> For example
> the string is "xHxxHyyKzDt"
> and the pattern is /^(.*)H(.*)K(.*)D(.*)$/
>
> so in one round of matching $1=x $2=xxHyy $3=z $4=t
> in another $1=xHxx $2=yy $3=z $4=t
>
>
> while ($sequence=~/$pattern/g )
> doesn't work, I think because the matches are overlapping
>
> while ($sequence=~/(?=$pattern)/g )
> also doesn't work
>
Hi Öznur.
The problem is that wildcards in regexes will match either the maximum
number of characters for a match to work (.*) or the minimum (.*?) and
nothing in between. The only way I can think of to do this is to
put an explicit count on your first field and try all possible values,
like the program below. Others are likely to come up with something
neater.
HTH,
Rob
use strict;
use warnings;

my $sequence = 'xHxxHyyKzDt';

# Try every possible length $n for the first field in turn.
foreach my $n (1 .. length $sequence) {
    # Skip lengths where no H follows the first $n characters.
    next unless $sequence =~ /^(.{$n})H(.+)K(.+)D(.+)/;
    printf "\$1 = %-6s", $1;
    printf "\$2 = %-6s", $2;
    printf "\$3 = %-6s", $3;
    printf "\$4 = %-6s", $4;
    print "\n\n";
}
**OUTPUT
$1 = x $2 = xxHyy $3 = z $4 = t
$1 = xHxx $2 = yy $3 = z $4 = t
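One possible alternative to the explicit count, sketched here under the assumption of Perl 5.10 or later, is to let the regex engine do the enumeration: an embedded code block records the captures each time the whole pattern has matched, and (*FAIL) then forces the engine to backtrack into the next possibility. The @found array is a name chosen for this illustration, and \z is used in place of $ so that the anchor is not followed directly by an opening parenthesis.

```perl
use strict;
use warnings;

my $sequence = 'xHxxHyyKzDt';

# Package variable, so the code block inside the regex can reach it
# without relying on closure behaviour that varies between perl versions.
our @found;

# (?{ ... }) runs whenever the engine gets past \z, i.e. on a full match.
# (*FAIL) then rejects that attempt, so the engine backtracks and tries
# the next way of distributing the characters among the four groups.
$sequence =~ /
    ^ (.*) H (.*) K (.*) D (.*) \z
    (?{ push @found, [ $1, $2, $3, $4 ] })
    (*FAIL)
/x;

print join('-', @$_), "\n" for @found;
```

The match as a whole always fails, so its return value is not useful; only the side effect on @found matters. Embedded code blocks are documented in perlre as something to use with care, so the explicit loop above may still be the easier version to maintain.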
-
Öznur Taştan #5
Re: all matches of a regex
wow! creative!
i think i can modify this for the general case.
Thanks
oznur
----- Original Message -----
From: "Rob Dixon" <rob@dixon.port995.com>
To: "Öznur Taştan" <oznurtastan@su.sabanciuniv.edu>
Sent: Sunday, February 15, 2004 2:52 PM
Subject: Re: all matches of a regex
-
Rob #6
Re: all matches of a regex
Öznur Tastan wrote:
Hi Öznur.
Yes, I did suspect my previous answer wouldn't handle the general case, but
I hoped it may be good enough. I'm sure it's not possible using a simple regex.
I've written subroutine split_list() below, which takes a string of characters to
split on and a string to split, a little like the split() built-in. It finds all
the ways to split the string at each of the characters in sequence.
It's a recursive subroutine to make it neater. It works by finding a way to
split on the first character in the list and then calling itself to split
the right-hand half on the remaining characters.
The return value is an array of all the possibilities with the split
characters replaced with hyphens.
Post to the list if you need any help with any part of it.
(Thanks, this was an interesting little problem!)
Cheers,
Rob
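use strict;
use warnings;

# Return every way of splitting $string on the characters of $list, in order.
sub split_list {
    my ($list, $string) = @_;
    # Take the first split character off the list; with none left, we are done.
    return ($string) unless $list =~ s/(.)//;
    my $split = $1;
    my @splits;
    # Visit each occurrence of the split character in turn.
    # \Q...\E makes a split character such as '.' match literally.
    while ( $string =~ /\Q$split\E/g ) {
        my $pos = pos $string;
        my $left = substr $string, 0, $pos - 1;
        my $right = substr $string, $pos;
        # Split the right-hand part on the remaining characters.
        push @splits, map "$left-$_", split_list($list, $right);
    }
    return @splits;
}

my @ret = split_list('HKD', 'xHxxHyyKzDt');
print map "$_\n", @ret;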
**OUTPUT
x-xxHyy-z-t
xHxx-yy-z-t
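For the looser patterns mentioned earlier in the thread, such as [HED] followed by [KL]{3}, the same recursive idea might be extended to take a list of regexes instead of a string of characters. The following is only a sketch under that assumption: split_on_patterns() is a hypothetical name, it returns array references of pieces rather than hyphen-joined strings, and it resets pos() by hand so that overlapping matches of one pattern are all visited.

```perl
use strict;
use warnings;

# Hypothetical generalisation of split_list(): split $string at one match
# of each pattern, in order, and return every combination as an array ref.
sub split_on_patterns {
    my ($string, @patterns) = @_;
    return ([$string]) unless @patterns;

    my ($first, @rest) = @patterns;
    my @results;
    while ( $string =~ /$first/g ) {
        # Save the match offsets before the recursive call runs more regexes.
        my ($start, $end) = ($-[0], $+[0]);
        my $left  = substr $string, 0, $start;
        my $right = substr $string, $end;
        push @results, map { [ $left, @$_ ] } split_on_patterns($right, @rest);

        # Resume one character after the start of this match, so matches
        # that overlap it are found too; stop at the end of the string.
        last if $start >= length $string;
        pos($string) = $start + 1;
    }
    return @results;
}

my @ret = split_on_patterns('xHxxHyyKzDt', qr/H/, qr/K/, qr/D/);
print join('-', @$_), "\n" for @ret;
```

Called as split_on_patterns($sequence, qr/[HED]/, qr/[KL]{3}/), it should return one array reference per combination of match sites, each holding the three substrings around them.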
-
Unregistered #7
Re: all matches of a regex
my $string = "pass test1\n abasdfasdfkl asd asfk as;d as asdf sdja s;lfjasldk f sdlk fjs\n\n\nal;s asdlkj fsdfjl;jkdsf pass test2\n ads;lk pass test3\n as;ljk";
for my $val ($string =~ /pass\s(.*)\n/g) {
print ("\n\n\n$val \n\n\n");
}
output:
test1
test2
test3