David T-G #1
how to parse blank-line-separated records
Hi, all --
I'm wrestling with a data file containing owners and contact info and it
suddenly occurred to me that I could probably change my record separator
from \n to \n\n (a blank line) and grab the whole record that way.
Assuming I figure out how to do that, then how do I match the pieces?
The file looks a lot like
header stuff
code unit
owner home_phone work_phone
addr
city, st zip
where any of the phone numbers or the addresses might be missing, but we
can count on the column positions for formatting (and thus parsing).
So I probably go through a
while (<>)
loop and it sucks in each record for me, but then how do I match to get
the various pieces -- around the newlines?
Yes, sample code would be welcome :-) So would pointers to where this
has been done before; I'm just not finding it as I read the code examples
from _Programming_ (2e) this morning :-(
TIA & HAND
:-D
--
David T-G * There is too much animal courage in
(play) davidtg@justpickone.org * society and not sufficient moral courage.
(work) davidtgwork@justpickone.org -- Mary Baker Eddy, "Science and Health"
http://justpickone.org/davidtg/ Shpx gur Pbzzhavpngvbaf Qrprapl Npg!
David T-G Guest
-
Bob Showalter #2
RE: how to parse blank-line-separated records
David T-G wrote:
> Hi, all --
>
> I'm wrestling with a data file containing owners and contact
> info and it
> suddenly occurred to me that I could probably change my
> record separator
> from \n to \n\n (a blank line) and grab the whole record that way.
Yes, or set $/ = '' to get "paragraph" mode. That's a little more flexible,
as Perl will treat any sequence of multiple blank lines as a record
separator.
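Paragraph mode can be sketched as follows. This is a minimal example, not code from the thread: the helper name read_records and the sample data are made up, and an in-memory string stands in for the real data file.

```perl
use strict;
use warnings;

# Read blank-line-separated records from a filehandle.
# read_records is a hypothetical helper name used only for this sketch.
sub read_records {
    my ($fh) = @_;
    local $/ = '';              # paragraph mode: a run of blank lines ends a record
    my @records;
    while (my $record = <$fh>) {
        chomp $record;          # in paragraph mode chomp strips all trailing newlines
        push @records, $record;
    }
    return @records;
}

# Made-up sample data; the two records are separated by three blank lines.
my $data = "unit 101\nOwner One\n\n\n\nunit 206\nOwner Two\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @records = read_records($fh);
close $fh;

print scalar(@records), " records\n";
print "[$_]\n" for @records;
```

Using local keeps the change to $/ confined to the sub, so the rest of the program still reads line by line.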
> Assuming I figure out how to do that, then how do I match the pieces?
>
> The file looks a lot like
>
> header stuff
> code unit
> owner home_phone work_phone
> addr city, st zip
>
> where any of the phone numbers or the addresses might be
> missing, but we
> can count on the column positions for formatting (and thus parsing).
>
> So I probably go through a
>
> while (<>)
>
> loop and it sucks in each record for me, but then how do I
> match to get
> the various pieces -- around the newlines?
If the paragraphs have a definite fixed format, usually unpack() is the
easiest way to grab the data. Use 'x' in your pattern to skip over bytes,
and 'A' to extract a sequence of bytes. So, if you want to skip 20 chars,
then grab 8 chars, then skip 32 chars, then grab 15 chars, you use:
my @fields = unpack('x20 A8 x32 A15', $record);
(use lowercase 'a' instead of 'A' if you want to preserve trailing blanks on
the fields you extract.)
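To see that template in action, here is a small sketch. The record is built with sprintf purely for illustration; the field contents and widths are assumptions chosen to line up with the 'x20 A8 x32 A15' template, not the real file layout.

```perl
use strict;
use warnings;

# Build a made-up fixed-width record: 20 filler, 8 data, 32 filler, 15 data.
my $record = sprintf('%-20s%-8s%-32s%-15s',
    'skipped prefix', '555-1234', 'more skipped text', 'Springfield');

# 'A' strips trailing blanks from each extracted field.
my @fields = unpack('x20 A8 x32 A15', $record);

# 'a' returns the raw bytes, trailing blanks included.
my @raw = unpack('x20 a8 x32 a15', $record);

print "A: [$fields[0]] [$fields[1]]\n";
print "a: [$raw[0]] [$raw[1]]\n";
```

The difference shows up in the second field: 'A15' yields the bare word, while 'a15' keeps the padding out to 15 characters.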
Otherwise, you can construct regexes that grab what you're looking for.
Depending on your regex, you might need to use the /s and/or /m modifiers to
change the way ^, $, and . match within the multi-line string.
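A regex-based version might look like the sketch below. The record contents, the ddd-dddd phone format, and the "city, ST zip" shape of the last line are all assumptions made for illustration; the real patterns would have to be adapted to the actual data.

```perl
use strict;
use warnings;

# Made-up record in roughly the shape described in the thread.
my $record = join "\n",
    'A17 101',
    'J. Smith            555-1234    555-9876',
    '12 Short Way',
    'Springfield, IL 62704';

# Grab anything shaped like ddd-dddd (assumed phone format), wherever it sits.
my @phones = $record =~ /\b(\d{3}-\d{4})\b/g;

# /m lets ^ and $ anchor at the line boundaries inside the record, so this
# pattern can pick out the "city, ST zip" line without splitting first.
my ($city, $state, $zip) =
    $record =~ /^([^,\n]+),\s+([A-Z]{2})\s+(\d{5})$/m;

print "phones: @phones\n";
print "city=$city state=$state zip=$zip\n";
```

Without the /m modifier the second match would fail here, because ^ would only match at the very start of the record.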
>
> Yes, sample code would be welcome :-) So would pointers to where this
> has been done before; I'm just not finding it as I read the
> code examples
> from _Programming_ (2e) this morning :-(
Bob Showalter Guest
-
David T-G #3
Re: how to parse blank-line-separated records
Bob, et al --
...and then Bob Showalter said...
%
% David T-G wrote:
% >
% > suddenly occurred to me that I could probably change my
% > record separator
% > from \n to \n\n (a blank line) and grab the whole record that way.
%
% Yes, or set $/ = '' to get "paragraph" mode. That's a little more flexible,
Ah, yes; that's what I meant.
% as Perl will treat any sequence of multiple blank lines as a record
% separator.
Right. That's a Good Thing(tm).
%
% > Assuming I figure out how to do that, then how do I match the pieces?
% >
% > The file looks a lot like
% >
% > header stuff
% > code unit
% > owner home_phone work_phone
% > addr city, st zip
% >
% > where any of the phone numbers or the addresses might be
% > missing, but we
% > can count on the column positions for formatting (and thus parsing).
...
%
% If the paragraphs have a definite fixed format, usually unpack() is the
% easiest way to grab the data. Use 'x' in your pattern to skip over bytes,
% and 'A' to extract a sequence of bytes. So, if you want to skip 20 chars,
% then grab 8 chars, then skip 32 chars, then grab 15 chars, you use:
%
% my @fields = unpack('x20 A8 x32 A15', $record);
%
% (use lowercase 'a' instead of 'A' if you want to preserve trailing blanks on
% the fields you extract.)
Well, they do but the lines might be short. That is, we have
unitcode 101 Short Way
Owner
Address
...
unitcode 206 Longer Circle
Owner vonLongName, III
Address
as well as phone numbers that might or might not be there. I'm not sure
how I'd unpack anything beyond the first possibly-in-a-different-column
newline. I suppose I should have said "we can count on the starting
column positions if we get that far out in a given record", which is
probably different!
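One possible way around the short-line problem is to split each record into lines first, pad every line to its full width, and only then unpack it. The sketch below assumes a hypothetical layout for the owner line (owner, home phone, work phone at fixed columns); the column widths and the helper name parse_owner_line are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical layout for the owner line; the thread gives no real columns:
#   owner in columns 0-23, home phone in 24-35, work phone in 36-47.
my $OWNER_TEMPLATE = 'A24 A12 A12';
my $OWNER_WIDTH    = 48;

sub parse_owner_line {
    my ($line) = @_;
    # Pad short lines with blanks so that every column exists before unpacking.
    my $padded = sprintf('%-*s', $OWNER_WIDTH, $line);
    return unpack($OWNER_TEMPLATE, $padded);
}

# Made-up record modelled on the short example above.
my $record = join "\n",
    'unitcode 101 Short Way',
    'Owner',
    'Address';

# Work on one line at a time, so a short line cannot shift the columns
# of the lines that follow it.
my ($unit_line, $owner_line, @address_lines) = split /\n/, $record;
my ($owner, $home, $work) = parse_owner_line($owner_line);

print "owner=[$owner] home=[$home] work=[$work]\n";
```

Missing phone numbers simply come back as empty strings, since 'A' strips the padding blanks.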
%
% Otherwise, you can construct regexes that grab what you're looking for.
% Depending on your regex, you might need to use the /s and/or /m modifiers to
% change the way ^, $, and . match within the multi-line string.
I suppose this will be the way to go, then. I don't see how /s and /m
will change the begin- and end-of-line matching, though... Where do I
look for that?
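Both modifiers are documented in the perlre manual page (perldoc perlre). The small sketch below, using a made-up three-line record, shows the effect: /m changes where ^ and $ can match, and /s changes whether . matches a newline.

```perl
use strict;
use warnings;

# Made-up multi-line record.
my $record = "code unit\nowner 555-1234\naddr";

# Without /m, ^ matches only at the very start of the string.
my @first_words_plain = $record =~ /^(\w+)/g;

# With /m, ^ also matches right after each embedded newline.
my @first_words_m = $record =~ /^(\w+)/mg;

# Without /s, . never matches a newline, so .* stops at the end of line one.
my ($dot_plain) = $record =~ /(.*)/;

# With /s, . matches newlines too, so .* runs to the end of the record.
my ($dot_s) = $record =~ /(.*)/s;

print "plain ^ : @first_words_plain\n";
print "with /m : @first_words_m\n";
print "plain . : $dot_plain\n";
print "with /s : $dot_s\n";
```

The two modifiers are independent, so a pattern can use either one or both.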
Thanks again & HAND
:-D
--
David T-G * There is too much animal courage in
(play) davidtg@justpickone.org * society and not sufficient moral courage.
(work) davidtgwork@justpickone.org -- Mary Baker Eddy, "Science and Health"
http://justpickone.org/davidtg/ Shpx gur Pbzzhavpngvbaf Qrprapl Npg!
David T-G Guest


