-
Graham #1
Reading Data File Records
I'm a little frustrated with Perl's line-by-line file reading and I am
hoping that someone can help me.
I have a data file that looks like:
--
! Comment 1
! Comment 2
! Comment ...
5 ! number of levels
*aaa [aaa units] ! space deliminated is common
1.0 2.0 3.0 4.0 5.0
*bbb [bbb units] ! csv is possible
1.0, 2.0, 3.0,
4.0 5.0
*ccc [ccc units] ! the file is written from fortran and the number of
columns is not fixed
10.0
20.0
30.0
40.0
50.0
....
--
Essentially, there is a header block that always begins with '!' in
the first column. This is followed by the number of elements in each
data block and an unknown number of data blocks having a set number of
elements.
The file is generated using about five lines of FORTRAN so it seems
somewhat surprising that I am up to 30 lines of perl with almost no
end in sight... Does anyone have an example showing how to process a
file in blocks using Perl?
Thanks,
Graham
Graham Guest
-
Brian Wakem #2
Re: Reading Data File Records
"Graham" <GrahamWilsonCA@yahoo.ca> wrote in message
news:eda30d78.0309090714.2a6f6431@posting.google.com...
> I'm a little frustrated with Perl's line-by-line file reading and I am
> hoping that someone can help me.
>
> I have a data file that looks like:
>
> --
> ! Comment 1
> ! Comment 2
> ! Comment ...
> 5 ! number of levels
> *aaa [aaa units] ! space deliminated is common
> 1.0 2.0 3.0 4.0 5.0
> *bbb [bbb units] ! csv is possible
> 1.0, 2.0, 3.0,
> 4.0 5.0
> *ccc [ccc units] ! the file is written from fortran and the number of
> columns is not fixed
> 10.0
> 20.0
> 30.0
> 40.0
> 50.0
> ...
> --
>
> Essentially, there is a header block that always begins with '!' in
> the first column. This is followed by the number of elements in each
> data block and an unknown number of data blocks having a set number of
> elements.
>
> The file is generated using about five lines of FORTRAN so it seems
> somewhat surprising that I am up to 30 lines of perl with almost no
> end in sight... Does anyone have an example showing how to process a
> file in blocks using Perl?
What do you want to do with it?
--
Brian Wakem
Brian Wakem Guest
-
James Willmore #3
Re: Reading Data File Records
On 9 Sep 2003 08:14:57 -0700
GrahamWilsonCA@yahoo.ca (Graham) wrote:
<snip>
> The file is generated using about five lines of FORTRAN so it seems
> somewhat surprising that I am up to 30 lines of perl with almost no
> end in sight... Does anyone have an example showing how to process
> a file in blocks using Perl?
Post your code - I have no idea what you are trying to do. Maybe it's
just me ;)
--
Jim
Copyright notice: all code written by the author in this post is
released under the GPL. http://www.gnu.org/licenses/gpl.txt
for more information.
a fortune quote ...
You cannot kill time without injuring eternity.
James Willmore Guest
-
Tulan W. Hu #4
Re: Reading Data File Records
"Graham" <GrahamWilsonCA@yahoo.ca> wrote in message ...
[snip..]
> The file is generated using about five lines of FORTRAN so it seems
> somewhat surprising that I am up to 30 lines of perl with almost no
> end in sight... Does anyone have an example showing how to process a
> file in blocks using Perl?
I would download the File::Slurp module from CPAN and install it.
http://search.cpan.org/author/MUIR/File-Slurp-2004.0904/
====
#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp;
my @allLines = read_file("data_file_name");
foreach my $line (@allLines) {
    # in case you need to process each line
    if ($line =~ /^!/) {
        # comment lines
    }
    else {
        # data lines
    }
}
Tulan W. Hu Guest
-
Jay Tilton #5
Re: Reading Data File Records
GrahamWilsonCA@yahoo.ca (Graham) wrote:
: I have a data file that looks like:
:
: --
: ! Comment 1
: ! Comment 2
: ! Comment ...
: 5 ! number of levels
: *aaa [aaa units] ! space deliminated is common
: 1.0 2.0 3.0 4.0 5.0
: *bbb [bbb units] ! csv is possible
: 1.0, 2.0, 3.0,
: 4.0 5.0
^
^
Should there be a comma between those two values?
: *ccc [ccc units] ! the file is written from fortran and the number of
: columns is not fixed
Is this really how the data file is formatted, or did your newsreader
word-wrap that line for you?
: 10.0
: 20.0
: 30.0
: 40.0
: 50.0
: ...
: --
:
: Essentially, there is a header block that always begins with '!' in
: the first column. This is followed by the number of elements in each
: data block and an unknown number of data blocks having a set number of
: elements.
The problem is determining where one block ends and another begins when
the only thing known about the block is how many elements it contains.
There's no apparent consistency or predictability to how the blocks may
be formatted, or to how the elements are separated. Altering the input
record separator, $/, then reading in a number of records isn't going to
work.
What might work would be to read lines of data until a block's requisite
number of elements have been acquired, but the elements themselves will
need to have a consistent, recognizable format, and a newline character
has to mark the boundary between blocks. From the sample data, the
elements all seem to be numbers with one place after the decimal.
As a first approximation of workable code,
#!perl
use warnings;
use strict;
my $elems_per_block;
while(<DATA>) {
    next if /^!/;
    ($elems_per_block) = /^(\d+)/;
    last;
}
my @blocks;
while(<DATA>) {
    my $block = $_;
    my $n = 0;
    while(<DATA>) {
        $block .= $_;
        last if $elems_per_block == ($n += () = /(\b\d+\.\d\b)/g);
    }
    push @blocks, $block;
}
for( @blocks ) {
    # whatever processing each block needs
    print "Block:\n$_\n";
}
__DATA__
! Comment 1
! Comment 2
! Comment ...
5 ! number of levels
*aaa [aaa units] ! space deliminated is common
1.0 2.0 3.0 4.0 5.0
*bbb [bbb units] ! csv is possible
1.0, 2.0, 3.0,
4.0 5.0
*ccc [ccc units] ! the file is written from fortran and the number of
columns is not fixed
10.0
20.0
30.0
40.0
50.0
: The file is generated using about five lines of FORTRAN so it seems
: somewhat surprising that I am up to 30 lines of perl with almost no
: end in sight...
Why should that be surprising? You're trying to build a modicum of
intelligence into one tool to compensate for another's lack of
sophistication. The Perl program would have a much easier time reading
if the FORTRAN program was only a little better at writing.
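For anyone puzzling over the match-counting expression in the code above, here is a minimal sketch of the same idiom in isolation. The helper name count_numbers is made up for illustration; the pattern is the one from the sample code, so it assumes every value has exactly one digit after the decimal point.

```perl
use strict;
use warnings;

# Count how many numeric tokens appear in one line of the data file.
sub count_numbers {
    my ($line) = @_;
    # A /g match in list context returns every match. Assigning that list
    # to () and then to a scalar yields the number of matches.
    my $count = () = $line =~ /\b\d+\.\d\b/g;
    return $count;
}

print count_numbers("1.0, 2.0, 3.0,"), "\n";    # prints 3
print count_numbers("4.0 5.0"), "\n";           # prints 2
print count_numbers("*ccc [ccc units]"), "\n";  # prints 0
```

Adding these per-line counts together until they reach the number of levels is what tells the reader loop that a block is complete, whatever separator the values use.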
Jay Tilton Guest
-
James Willmore #6
Re: Reading Data File Records
On 9 Sep 2003 15:41:03 -0700
GrahamWilsonCA@yahoo.ca (Graham) wrote:
> It seems it isn't just you. All I am trying to do is get the data
> blocks into a suitable perl structure so I can calculate some simple
> statistics and reformat it for another program. See comments in the
> second while loop.
>
> I really appreciate the help. I have a pile of files with this type
> of structure (a legacy of an ancient postdoc) that I need to
> manipulate and reformat.
First, let me say that each language is going to handle files and
variables differently. I say this because you commented on using
FORTRAN. I know nothing about FORTRAN, but have had _some_ dealings
with COBOL. Some functionality in COBOL is unavailable in Perl (such
as strictly defining variables). By the same token, there's
functionality in Perl that is not available in COBOL (such as regular
expressions). Having said that, here is some untested code that _may_
fit the bill for you. Again, it's untested and may _not_ be exactly
what you're looking for. If I'm off, I'm hoping someone will point
out where the errors are.
==untested==
#!/usr/bin/perl -w
use strict;
#define the name of the file
my $file = 'name_of_file_here';
#define a hash (associative array) for your records
my %records;
#open a file handle to the file - die if we can't open it
open(FILE, '<', $file)
    or die "Can't open file $file: $!\n";
#skip the header lines that lead with a "!", then
#get the number of levels - the portion before the first "!"
#can be done with substr - regular expression used for
#demonstration purposes
my $numLev;
while(<FILE>){
    next if /^!/;
    ($numLev) = m/^\s*(\d+)/;
    last;
}
#the block id of the records currently being read
my $key;
#while the file is open and does not return eof
while(<FILE>){
    #chomp the newline off the line
    chomp;
    #get the beginning of the line up until the first "!"
    #(strip the comments) - keep the whole line if there is no "!"
    #again - substr could be used
    (my $uncommented_line = $_) =~ s/!.*//;
    #a line leading with "*" starts a new block - use the
    #name after the "*" as the key for the records
    if ($uncommented_line =~ m/^\*(\S+)/){
        $key = $1;
        next;
    }
    #skip anything before the first block id
    next unless defined $key;
    #split the line on whitespace and/or commas and place each
    #'section' into an array, keeping only the numbers
    my @data = grep { /^[-+]?\d/ } split /[\s,]+/, $uncommented_line;
    #store the records as an array into the hash using the block id
    #as the key
    push @{$records{$key}}, @data;
}
close(FILE);
#to retrieve the records ...
foreach my $k(sort keys %records){
    print "$k => ",join(" ",@{$records{$k}}),"\n";
}
==untested==
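Once the values sit in a hash of arrays like %records, the "simple statistics" mentioned in the question are a few lines with the core List::Util module. This is only a sketch: the sub name block_stats and the sample values are made up for illustration, and which statistics are actually needed is an assumption.

```perl
use strict;
use warnings;
use List::Util qw(sum min max);

# Summarize one block: count, mean, min, max, and sample standard deviation.
sub block_stats {
    my @v = @_;
    return unless @v;                  # nothing to summarize
    my $n    = @v;
    my $mean = sum(@v) / $n;
    # Sample variance (divide by n - 1); defined as 0 for a single value.
    my $var  = $n > 1 ? sum(map { ($_ - $mean) ** 2 } @v) / ($n - 1) : 0;
    return {
        n => $n, mean => $mean, min => min(@v), max => max(@v),
        sd => sqrt($var),
    };
}

# Hypothetical records, shaped like the hash of arrays built above.
my %records = (
    aaa => [ 1.0, 2.0, 3.0, 4.0, 5.0 ],
    ccc => [ 10.0, 20.0, 30.0, 40.0, 50.0 ],
);

for my $k (sort keys %records) {
    my $s = block_stats(@{ $records{$k} });
    printf "%-4s n=%d mean=%.2f min=%.2f max=%.2f sd=%.2f\n",
        $k, @{$s}{qw(n mean min max sd)};
}
```

Keeping the statistics in a separate sub means the same code can be reused for every file in the pile, whatever the reader ends up looking like.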
HTH
--
Jim
Copyright notice: all code written by the author in this post is
released under the GPL. http://www.gnu.org/licenses/gpl.txt
for more information.
a fortune quote ...
What this country needs is a good five cent microcomputer.
James Willmore Guest
-
Anno Siegel #7
Re: Reading Data File Records
Jay Tilton <tiltonj@erols.com> wrote in comp.lang.perl.misc:
> GrahamWilsonCA@yahoo.ca (Graham) wrote:
> : The file is generated using about five lines of FORTRAN so it seems
> : somewhat surprising that I am up to 30 lines of perl with almost no
> : end in sight...
>
> Why should that be surprising? You're trying to build a modicum of
> intelligence into one tool to compensate for another's lack of
> sophistication. The Perl program would have a much easier time reading
> if the FORTRAN program was only a little better at writing.
Also, parsing input is generally harder than generating output. Printing
what comes along is easy. To read it back in, you must often (as in
the OP's case) understand what you have read so far to know how to
proceed.
The C functions printf() and scanf() are an attempt to make printing
and scanning symmetric. A look at their respective frequency of use
shows that the attempt wasn't a full success.
Anno
Anno Siegel Guest
-
Mike Flannigan #8
Re: Reading Data File Records
Graham wrote:
snip>
> It seems it isn't just you. All I am trying to do is get the data
> blocks into a suitable perl structure so I can calculate some simple
> statistics and reformat it for another program. See comments in the
> second while loop.
>
> I really appreciate the help. I have a pile of files with this type
> of structure (a legacy of an ancient postdoc) that I need to
> manipulate and reformat.
Don't be afraid to slurp the whole file. I slurp 400,000+
line files very quickly and do the processing. The only
trouble is if you do it more than once in the program.
You might see a big slowdown - at least on Win2000.
I never found a good solution to this (yet), so I just
run a bunch of individual perl scripts - one for each
file.
If you find a better solution, let us know.
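A minimal sketch of the slurp approach using only core Perl (an undefined $/ instead of File::Slurp). It assumes every block header starts with "*" in the first column, as in the sample data, and that comments contain no "*" at the start of a line. The names slurp and blocks_from_text are hypothetical, not from any module, and the temporary file only exists to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read an entire file into one string.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    local $/;                    # undef record separator: read to end of file
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Split slurped text into [ name, \@values ] pairs, one per "*name" header.
sub blocks_from_text {
    my ($text) = @_;
    $text =~ s/!.*//g;                           # strip "!" comments on each line
    my (undef, @chunks) = split /^\*/m, $text;   # drop text before the first header
    my @blocks;
    for my $chunk (@chunks) {
        my ($header, $body) = split /\n/, $chunk, 2;
        next unless defined $header;
        my ($name) = $header =~ /^(\w+)/;
        next unless defined $name;
        # Pull out numbers only, so spaces and commas both work as separators.
        my @values = defined $body
            ? $body =~ /[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?/g
            : ();
        push @blocks, [ $name, \@values ];
    }
    return @blocks;
}

# Write a small sample file, then slurp and parse it.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} <<'END';
! Comment 1
5 ! number of levels
*aaa [aaa units] ! space delimited
1.0 2.0 3.0 4.0 5.0
*bbb [bbb units] ! csv is possible
1.0, 2.0, 3.0,
4.0 5.0
END
close $tmp_fh or die "Can't close $tmp_path: $!";

for my $block (blocks_from_text(slurp($tmp_path))) {
    my ($name, $values) = @$block;
    print "$name => @$values\n";
}
```

This version ignores the element count in the header; comparing it with the size of each values array would be a cheap sanity check.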
Mike
Mike Flannigan Guest




