Chandramohan Neelakantan #1
Converting pdf to text
Hello all,
Problem:
I need to extract text information from a PDF file and write the text
to a file for a hardware project.
The text is contained in a table and gives the width and height
of the different layers of a chip.
The width and height information will be used to create test layouts
for the different layers using Cadence SKILL.
OS: HP-UX
Other tools used: Cadence SKILL
I wanted to do this initial PDF parsing in Perl because:
- it comes with the OS
- there is no point in writing the PDF parsing tool myself (which would be an independent
project in its own right)
- someone must have run into this parsing problem before
I hope I'm clear so far.
Searching:
I tried a module search on search.cpan.org, but as far as I have seen,
there is no module that extracts the text information from a PDF file.
I also tried searching on Google, but there only seems to be pdf2text, for
Linux.
Solutions:
- I would appreciate it if someone could point me to a module/script
that converts PDF to text
- any other suggestions for tackling the problem are welcome
Many thanks
CM
Chandramohan Neelakantan Guest
-
David Efflandt #2
Re: Converting pdf to text
On 10 Sep 2003, Chandramohan Neelakantan <knchandramohan@yahoo.com> wrote:
> Hello all,
>
> Problem:
>
> Need to extract text information from a pdf file , write the text
> to a file for a hardware project .
> The text is contained in a table and has the width and height
> information of different layers for a chip
> The widthe and height information would be used to create test layouts
> for different layers using Cadence SKILL.
>
> OS: Hp-UX
>
> Other tools used: Cadence SKILL
>
> I wanted to do this initial pdf parsing in Perl because:
>
> - it comes with the OS
> - No point in writing the pdf parsing tool (which wld be an independen
> project then)
> - someone must have experienced the parsing proble before
>
> I hope Im clear so far
>
> Searching:
>
> I tried module search on search.cpan.org but as far I have seen, I
> dint notice any that extracts the text information from a pdf file.
>
> I also tried seaarching on google but there seems to be pdf2text for
> Linux
My system calls it pdf2ascii, which is one of the utilities included with
ghostscript (PostScript and PDF language interpreter and previewer). You
might see if 'gs' is either on your system or if ghostscript could be
compiled for HP-UX. See if 'apropos pdf' (or ghostscript) turns up
anything.
Whether that would work depends on whether the PDF was created from a
text-based source. If the text is in an image (scanned, etc.), you would need
some sort of OCR software to interpret the graphical text.
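A minimal Perl sketch of the check suggested above: walk PATH and report the first converter that is installed. The candidate names are assumptions: pdf2ascii and pdftotext are the names mentioned in this thread, ps2ascii is the script Ghostscript normally installs, and gs is the Ghostscript interpreter itself. find_tool is a hypothetical helper, not part of any module.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full path of the first candidate
# program found in a PATH directory, or undef if none is installed.
sub find_tool {
    my @candidates = @_;
    for my $name (@candidates) {
        for my $dir (File::Spec->path) {
            my $full = File::Spec->catfile($dir, $name);
            # Must be a plain file that we are allowed to execute.
            return $full if -f $full && -x _;
        }
    }
    return;
}

# Candidate names are assumptions; adjust them for the local system.
my $tool = find_tool(qw(pdftotext pdf2ascii ps2ascii gs));
print defined $tool
    ? "found: $tool\n"
    : "no PDF-to-text tool found on PATH\n";
```

This only tells you that a tool exists. It says nothing about whether a particular PDF holds real text or a scanned image.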
--
David Efflandt - All spam ignored http://www.de-srv.com/
http://www.autox.chicago.il.us/ http://www.berniesfloral.net/
http://cgi-help.virtualave.net/ http://hammer.prohosting.com/~cgi-wiz/
David Efflandt Guest
-
Vlad Tepes #3
Re: Converting pdf to text
Chandramohan Neelakantan <knchandramohan@yahoo.com> wrote:
> Hello all,
>
> Need to extract text information from a pdf file , write the text
> to a file for a hardware project .
You could try using the command line utility pdftotext from the xpdf
distribution. I've had better experience with that tool than with using
pdf2ascii (comes with ghostscript).
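A rough sketch of how this might be driven from Perl, under several assumptions: pdftotext is on PATH, its -layout option keeps the table columns aligned, a text-file argument of "-" sends the output to stdout, and each table row looks like "<layer> <width> <height>". The real table layout is not shown in this thread, so the row pattern is a guess. parse_layer_rows and pdf_text_lines are hypothetical names, and the list form of pipe open needs Perl 5.8 or later.

```perl
use strict;
use warnings;

# Hypothetical parser: keep rows of the assumed form
# "<layer> <width> <height>" and skip everything else (headers, blanks).
sub parse_layer_rows {
    my @lines = @_;
    my %dims;
    for my $line (@lines) {
        if ($line =~ /^\s*(\S+)\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)\s*$/) {
            $dims{$1} = { width => $2, height => $3 };
        }
    }
    return \%dims;
}

# Run pdftotext and return its output lines. The list form of open
# bypasses the shell, so odd characters in the file name are harmless.
sub pdf_text_lines {
    my ($pdf) = @_;
    open(my $fh, '-|', 'pdftotext', '-layout', $pdf, '-')
        or die "cannot run pdftotext: $!";
    my @lines = <$fh>;
    close($fh) or die "pdftotext failed (exit status $?)";
    return @lines;
}

if (@ARGV) {
    my @lines = pdf_text_lines($ARGV[0]);
    # An empty result suggests the PDF holds scanned images, not text.
    warn "no text extracted; the PDF may be a scanned image\n"
        unless grep { /\S/ } @lines;
    my $dims = parse_layer_rows(@lines);
    for my $layer (sort keys %$dims) {
        print "$layer $dims->{$layer}{width} $dims->{$layer}{height}\n";
    }
}
```

Saved under a name of your choice, it takes the PDF path as its only argument and prints one "layer width height" line per matched row, which can be redirected to a file for the SKILL step.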
Just my two cents,
--
Vlad
Vlad Tepes Guest
-
Chandramohan Neelakantan #4
Re: Converting pdf to text
Many thanks for the tips.
-CM
Vlad Tepes <minceme@start.no> wrote in message news:<bjokeh$fat$1@troll.powertech.no>...
> Chandramohan Neelakantan <knchandramohan@yahoo.com> wrote:
> > Hello all,
> >
> > Need to extract text information from a pdf file , write the text
> > to a file for a hardware project .
> You could try using the command line utility pdftotext from the xpdf
> distribution. I've got better experience with that tool than with using
> pdf2ascii (comes with ghostscript).
>
> Just my two cents,
Chandramohan Neelakantan Guest




