pdf trouble

Christopher Brannon writes:
... A bunch of pdf conversion utilities.
These are about all that I know of too.  I did find that keeping your
xpdf version up to date helped: it has cracked some things my old
version wouldn't do, probably to do with compression.  Note that the
most useful string in many documents is the email address of the
author; see below for an example.
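A rough sketch of how one might pull likely author addresses out of
text that a converter has already produced.  The function name
find_emails and the pattern are illustrative assumptions, not part of
xpdf or any other tool mentioned here, and the pattern is deliberately
loose rather than a full address validator.

```python
import re

# Loose pattern: good enough to spot a likely address in extracted text.
# It is an assumption for illustration, not an RFC 5322 validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_emails(text):
    """Return likely email addresses in order of first appearance, without duplicates."""
    seen = []
    for match in EMAIL_RE.findall(text):
        if match not in seen:
            seen.append(match)
    return seen
```

Run it over whatever text the converter managed to extract; even when
the body is badly mangled, an address on the title page often survives.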
>There may even be lossage when the file contains English words.  For example,
>one PDF file contained the word "modifications".  The web-based tools produced
>"modi cations" as output.
Ah, this is probably good news.  That's most likely because of the
unrecognized "fi" ligature, which *probably* means the document was
originally TeX.  That's my experience anyway.  If so, ask the author
for the original.  I've never been refused, and TeX is certainly the
formatter of choice for anyone doing serious mathematics.  And it's
a joy to read.
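A minimal sketch of two ways the ligature damage might be patched up
after conversion.  Both helper names, the LIGATURES tuple, and the
vocabulary argument are assumptions made for illustration.  The first
helper covers the case where the converter emits a Unicode ligature
character such as U+FB01; the second covers the case quoted above,
where the ligature is dropped and a space is left behind.

```python
import unicodedata

# Common Latin ligatures, longest first so "ffi" is tried before "fi".
LIGATURES = ("ffi", "ffl", "ff", "fi", "fl")


def expand_ligatures(text):
    """Replace ligature characters (e.g. U+FB01) with plain letters.

    NFKC normalization also folds other compatibility characters, so it
    may change more than just ligatures.
    """
    return unicodedata.normalize("NFKC", text)


def repair_dropped_ligatures(text, vocabulary):
    """Rejoin word pairs such as "modi cations" when inserting a ligature
    between them yields a word found in `vocabulary` (a set of lowercase words).
    """
    words = text.split(" ")
    out = []
    i = 0
    while i < len(words):
        joined = None
        if i + 1 < len(words):
            for lig in LIGATURES:
                candidate = words[i] + lig + words[i + 1]
                if candidate.lower() in vocabulary:
                    joined = candidate
                    break
        if joined is not None:
            out.append(joined)
            i += 2  # both fragments were consumed
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)
```

The second helper is only as good as the word list it is given, and it
does not handle fragments with punctuation attached, so treat it as a
starting point.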
>PDF is a complex file format.  Writing a translator for it is certainly no
>mean feat, I am sure.
The problem is rather that pdf is such a *simple* format: these
translators are reconstructing stuff from too little information.
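To illustrate what "too little information" can mean, here is a toy
sketch.  It assumes the page gives only characters with horizontal
positions and widths, so a translator has to guess word boundaries from
the gaps.  The (x, width, char) tuple layout, the function name, and
the gap_ratio threshold are all assumptions for illustration; real
converters use more elaborate heuristics.

```python
def glyphs_to_text(glyphs, gap_ratio=0.3):
    """Rebuild one line of text from positioned glyphs.

    glyphs: iterable of (x, width, char) tuples on a single baseline.
    A space is guessed wherever the gap before a glyph exceeds
    gap_ratio times that glyph's width.
    """
    ordered = sorted(glyphs, key=lambda g: g[0])  # left to right
    pieces = []
    prev_end = None
    for x, width, char in ordered:
        if prev_end is not None and x - prev_end > gap_ratio * width:
            pieces.append(" ")
        pieces.append(char)
        prev_end = x + width
    return "".join(pieces)
```

If a glyph such as the "fi" ligature cannot be mapped to a character and
is skipped, the hole it leaves looks just like a word break, which is
one way output like "modi cations" can arise.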
>So those are the possibilities for converting pdf to text under Linux, as
>I see them.
>None is perfect, though some work well.
>What do I do?  Help!
This looks about as good as it gets, I'm afraid.
>But the reverse seems to be the case.  I'm seeing a lot of pdf these days.
>Has it always been this way?  Maybe I'm just running into more pdf because
>I'm researching cryptography.
>Will it change in the foreseeable future?
Anything in the IT area will change within the foreseeable future; the
question is whether it will get better or worse.  Part of the solution
lies in relatively simple changes to authoring tools.  For example, I'd
like pdflatex to include the original document as an alternative
mark-up; I *think* the pdf spec now allows this.
Then there's the whole business of extracting maths from Word files
... I'm getting nowhere with that. 
>cheers 
Peter

_______________________________________________

Blinux-list@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/blinux-list
