John McKown #1
splitting / unpacking line into array
I have an input file which has many subrecords. The subrecord type is
denoted by the first 4 characters of each line. The rest of the line is
formatted similarly to the way that "pack" would format one. That is,
each data point in a subtype is always at the same offset and has the same
length, e.g. 10 characters starting at offset 30, or some such. What I'm
considering is using "unpack" and having a hash contain the unpack
template based on the subrecord type. Something like:
while (<FH>) {
    my $subrec = substr($_, 0, 4);                # subrecord type
    my @values = unpack $template{$subrec}, $_;   # fields for that type
    ...
}
Earlier in the code, I would have created the %template hash which would
have the template associated with the $subrec from the input file.
Is this a decent way to do this? Is there a better way?
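The plan above can be sketched as follows. This is only a minimal sketch under stated assumptions: the subrecord types and field widths in %template are made up for illustration, and parse_line is a hypothetical helper name, not part of any real record layout.

```perl
use strict;
use warnings;

# Hypothetical layouts: every type and width below is an assumption.
# "A<n>" reads n characters and strips trailing spaces.
my %template = (
    '0100' => 'A4 A8 A20',   # type, name, address
    '0101' => 'A4 A10 A5',   # type, two arbitrary fields
);

# Return the fields of one line, or an empty list for an unknown type.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my $subrec = substr($line, 0, 4);
    my $tmpl   = $template{$subrec} or return;
    return unpack $tmpl, $line;
}

# Build a fixed-width sample line so the column positions are exact.
my $sample = sprintf("%-4s%-8s%-20s\n", '0100', 'JOHN', '12 MAIN ST');
my @values = parse_line($sample);
print join('|', @values), "\n";
```

Keeping the templates in one hash means a new subrecord type only needs a new hash entry, not a new branch in the loop.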
--
Maranatha!
John McKown
-
Wiggins D Anconia #2
Re: splitting / unpacking line into array
> I have an input file which has many subrecords. The subrecord type is
> denoted by the first 4 characters of each line. The rest of the line is
> formatted similarly to the way that "pack" would format one. That is,
> each data point in a subtype is always at the same offset and has the same
> length, e.g. 10 characters starting at offset 30, or some such. What I'm
> considering is using "unpack" and having a hash contain the unpack
> template based on the subrecord type. Something like:
>
> while (<FH>) {
>     my $subrec = substr($_, 0, 4);
>     my @values = unpack $template{$subrec}, $_;
>     ...
> }
>
> Earlier in the code, I would have created the %template hash which would
> have the template associated with the $subrec from the input file.
>
> Is this a decent way to do this? Is there a better way?

Sounds pretty good to me. One concern, do the sub record types always
have the same number of fields? Using your array to unpack into may
turn into a maintenance nightmare with respect to indexing into it to
get values if the record formats are significantly different, etc.
Second concern, are you processing the records completely within the
loop or needing to parse them all before doing anything with them? In
the latter case you may need to store them to an array based on type
rather than directly to a 'values' temporary array, etc.
For the first concern you may consider using a hash slice with the keys
being associated with the subtype stored in the original hash where you
retrieve the record format from.
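One way to read the hash-slice suggestion, as a sketch only: keep the field names next to the unpack template for each subtype, then assign the unpacked values to a hash slice keyed by those names. The %layout structure, the parse_record helper, the field names, and the widths are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical layouts: a template plus field names per subrecord type.
my %layout = (
    '0100' => { template => 'A4 A8 A20 A12', fields => [qw(type name address city)] },
    '0101' => { template => 'A4 A10 A5',     fields => [qw(type some1 some2)] },
);

# Unpack one line into a hash keyed by field name.
# Returns a hash reference, or undef for an unknown subrecord type.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my $spec = $layout{ substr($line, 0, 4) } or return;
    my %rec;
    # Hash slice: assign all unpacked values to their names in one step.
    @rec{ @{ $spec->{fields} } } = unpack $spec->{template}, $line;
    return \%rec;
}

my $rec = parse_record(sprintf("%-4s%-8s%-20s%-12s\n", '0100', 'JOHN', '12 MAIN ST', 'TULSA'));
print "$rec->{name} lives in $rec->{city}\n";
```

With this shape the rest of the program refers to $rec->{city} rather than $values[3], so a changed record layout only touches the %layout entry.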
Obviously there is also the potential to use objects here but that may
be overkill depending on what you are doing with the data after you have
unpacked it....
http://danconia.org
-
John McKown #3
Re: splitting / unpacking line into array
On Mon, 2 Feb 2004, Wiggins d Anconia wrote:
> Sounds pretty good to me. One concern, do the sub record types always
> have the same number of fields? Using your array to unpack into may
> turn into a maintenance nightmare with respect to indexing into it to
> get values if the record formats are significantly different, etc.

Actually, that was only an example. I really hope to have the result
returned more like:
if ($subrec eq '0100') {
    ($name, $address, $city) = unpack $template{$subrec}, $_;
} elsif ($subrec eq '0101') {
    ($some1, $some2) = unpack $template{$subrec}, $_;
}
and so on for each defined $subrec.
> Second concern, are you processing the records completely within the
> loop or needing to parse them all before doing anything with them? In
> the latter case you may need to store them to an array based on type
> rather than directly to a 'values' temporary array, etc.

I will be processing the records one at a time and putting them in a
"persistent storage" for retrieval later in a reporting program. I have
not yet determined what sort of "persistent storage" I want. Perhaps
DBM, perhaps PostgreSQL, perhaps MySQL, <whatever>.

I may end up not even doing this since PostgreSQL, at least, has a way to
load records from a "flat file". I just like to leave my options open. And
I'm looking at Perl solutions right now mainly because I'm trying to learn
Perl.
<off-topic>
Also, if I find a "nice" Perl solution, I may implement it "in production"
on our mainframe (IBM zSeries) at work. The actual data being parsed is a
RACF (security system) database unload. If I can ftp that data from z/OS
to our Linux/390 system and do all my reporting there, I can save z/OS CPU
utilization. That's because Linux/390 on our zSeries runs on a separate
processor from the z/OS work. The z/OS work cannot use this processor due
to licensing restrictions. So, any work that I can "offload" from z/OS is
a net gain because the IFL (Linux processor) is basically idle right now.
I would then use Perl to create reports which would then be ftp'ed back to
the z/OS system. This gets me "brownie points" by offloading z/OS
processing. We are critically short of z/OS processor power and the next
upgrade would cost 1.5 million dollars in software "upgrade" fees.
If this works for the database unload, I can use a similar system for RACF
reports run against the "reformatted audit logs". Again, getting "brownie
points" for offloading work.
This is why I'm considering a Perl-only solution. I have Perl on our SuSE
Linux/390 system. I do not have any SQL database and am not really good
enough to try to port something like PostgreSQL or MySQL.
</off-topic>
> For the first concern you may consider using a hash slice with the keys
> being associated with the subtype stored in the original hash where you
> retrieve the record format from.

Good idea. I'll keep it in mind.

Thanks much!
--
Maranatha!
John McKown
-
Wiggins D'Anconia #4
Re: splitting / unpacking line into array
John McKown wrote:
> On Mon, 2 Feb 2004, Wiggins d Anconia wrote:
>
>> Sounds pretty good to me. One concern, do the sub record types always
>> have the same number of fields? Using your array to unpack into may
>> turn into a maintenance nightmare with respect to indexing into it to
>> get values if the record formats are significantly different, etc.
>
> Actually, that was only an example. I really hope to have the result
> returned more like:
>
> if ($subrec eq '0100') {
>     ($name, $address, $city) = unpack $template{$subrec}, $_;
> } elsif ($subrec eq '0101') {
>     ($some1, $some2) = unpack $template{$subrec}, $_;
> }
>
> and so on for each defined $subrec.

That works, though you have to repeat your unpack over and over (not a
big deal). Using the slices you only need it once and don't have to
check the subrec type, though you will again when you use them... unless
you push to an array in a hash where the key is the subtype and
then just loop over each of the different types, which might make the
code more modular, granted the data structure would be more complicated
(and unordered at that point).
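The "array in a hash" idea might look like the sketch below: each parsed record is pushed onto an array stored under its subtype, and each type can then be processed in its own loop. The templates, the group_records helper, and the sample lines are assumptions for illustration, not the real record layouts.

```perl
use strict;
use warnings;

# Hypothetical templates; the types and widths are assumptions.
my %template = (
    '0100' => 'A4 A8 A20',
    '0101' => 'A4 A10 A5',
);

# Group parsed records by subtype.
# $by_type->{$subrec} is an array of array references, one per record.
sub group_records {
    my (@lines) = @_;
    my %by_type;
    for my $line (@lines) {
        chomp $line;
        my $subrec = substr($line, 0, 4);
        next unless exists $template{$subrec};   # skip unknown types
        push @{ $by_type{$subrec} }, [ unpack $template{$subrec}, $line ];
    }
    return \%by_type;
}

my $by_type = group_records(
    sprintf("%-4s%-8s%-20s", '0100', 'JOHN', '12 MAIN ST'),
    sprintf("%-4s%-10s%-5s", '0101', 'abc',  'xy'),
    sprintf("%-4s%-8s%-20s", '0100', 'MARY', '9 ELM AVE'),
);

# Records keep their input order within a type, but not across types.
for my $subrec (sort keys %$by_type) {
    printf "%s: %d record(s)\n", $subrec, scalar @{ $by_type->{$subrec} };
}
```

The trade-off is the one noted above: per-type loops are simpler to write, but the original interleaving of record types is lost unless it is stored separately.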
>> Second concern, are you processing the records completely within the
>> loop or needing to parse them all before doing anything with them? In
>> the latter case you may need to store them to an array based on type
>> rather than directly to a 'values' temporary array, etc.
>
> I will be processing the records one at a time and putting them in a
> "persistent storage" for retrieval later in a reporting program. I have
> not yet determined what sort of "persistent storage" I want. Perhaps
> DBM, perhaps PostgreSQL, perhaps MySQL, <whatever>.
>
> I may end up not even doing this since PostgreSQL, at least, has a way to
> load records from a "flat file". I just like to leave my options open. And
> I'm looking at Perl solutions right now mainly because I'm trying to learn
> Perl.

MySQL can load flat files as well, though I don't know about formatted
files like you describe.
> <off-topic>
> Also, if I find a "nice" Perl solution, I may implement it "in production"
> on our mainframe (IBM zSeries) at work. The actual data being parsed is a
> RACF (security system) database unload. If I can ftp that data from z/OS
> to our Linux/390 system and do all my reporting there, I can save z/OS CPU
> utilization. That's because Linux/390 on our zSeries runs on a separate
> processor from the z/OS work. The z/OS work cannot use this processor due
> to licensing restrictions. So, any work that I can "offload" from z/OS is
> a net gain because the IFL (Linux processor) is basically idle right now.
> I would then use Perl to create reports which would then be ftp'ed back to
> the z/OS system. This gets me "brownie points" by offloading z/OS
> processing. We are critically short of z/OS processor power and the next
> upgrade would cost 1.5 million dollars in software "upgrade" fees.

Yikes, I understood just enough of that to know that I am running for
the hills :-)... Though I will say that it should be doable, and I
assume you have checked out Net::FTP...
> If this works for the database unload, I can use a similar system for RACF
> reports run against the "reformatted audit logs". Again, getting "brownie
> points" for offloading work.
>
> This is why I'm considering a Perl-only solution. I have Perl on our SuSE
> Linux/390 system. I do not have any SQL database and am not really good
> enough to try to port something like PostgreSQL or MySQL.

If you decide against using a "real" database you might consider using
some of the CSV text file modules. There is even a DBD::CSV that will
allow you to implement using "real" SQL and the DBI, in case you later
port to a database and don't want to change the code then,
though it is not speedy by any means. There is also XML, but that is
all I will say for now :-)....
> </off-topic>
>
>> For the first concern you may consider using a hash slice with the keys
>> being associated with the subtype stored in the original hash where you
>> retrieve the record format from.
>
> Good idea. I'll keep it in mind.
>
> thanks much!

Good luck,

http://danconia.org