Christopher Brannon writes:
...
A bunch of pdf conversion utilities.

These are about all that I know of too. I did find that keeping your xpdf version up to date helped; it's cracked some things my old version wouldn't do, probably to do with compression. Note that the most useful string in many documents is the email address of the author, see below for an example.

>There may even be lossage when the file contains English words. For example,
>one PDF file contained the word "modifications". The web-based tools produced
>"modi cations" as output.

Ah, this is probably good news. The gap is most likely an unrecognized "fi" ligature, which *probably* means the document was originally written in TeX. That's my experience anyway. If so, ask the author for the original. I've never been refused, and TeX is certainly the formatter of choice for anyone doing serious mathematics. And it's a joy to read.

>PDF is a complex file format. Writing a translator for it is certainly no
>mean feat, I am sure.

The problem is that pdf is such a *simple* format: these translators are reconstructing stuff from too little information.

>So those are the possibilities for converting pdf to text under Linux, as
>I see them.
>None is perfect, though some work well.
>What do I do? Help!

This looks about as good as it gets, I'm afraid.

>But the reverse seems to be the case. I'm seeing a lot of pdf these days.
>Has it always been this way? Maybe I'm just running into more pdf because
>I'm researching cryptography.
>Will it change in the foreseeable future?

Anything in the IT area will change within the foreseeable future; the question is whether it will get better or worse. Part of the solution lies in relatively simple changes to authoring tools. For example, I'd like pdflatex to include the original document as an alternative mark-up; I *think* the pdf spec now allows this. Then there's the whole business of extracting maths from Word files ... I'm getting nowhere with that.
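
For what it's worth, the "modi cations" damage can sometimes be patched up after conversion. What follows is only a rough sketch in Python, not something any of the converters above do. It assumes you can supply your own word list (`known_words` is a hypothetical set of lower-case words), and the function names are made up for illustration. The first function covers the case where the converter did emit a ligature character; the second guesses at ligatures that were dropped and left a space.

```python
import unicodedata

# Letter sequences that TeX fonts commonly set as one glyph, longest first.
LIGATURES = ("ffi", "ffl", "ff", "fi", "fl")


def normalise_ligatures(text):
    """Expand ligature code points such as U+FB01 into plain letters.

    NFKC also rewrites other compatibility characters (superscripts,
    full-width forms), so it may change more than ligatures.
    """
    return unicodedata.normalize("NFKC", text)


def try_join(left, right, known_words):
    """Return left + ligature + right if that is a known word, else None."""
    core = right.rstrip(".,;:!?")      # keep trailing punctuation aside
    tail = right[len(core):]
    if not (left.isalpha() and core.isalpha()):
        return None
    for lig in LIGATURES:
        candidate = left + lig + core
        if candidate.lower() in known_words:
            return candidate + tail
    return None


def repair_dropped_ligatures(text, known_words):
    """Rejoin word pairs like 'modi cations' when a ligature fills the gap.

    known_words is an assumption: a set of lower-case words from the caller.
    Only single spaces inside a word are handled; a ligature lost at the
    very start or end of a word is not.
    """
    words = text.split(" ")
    out = []
    i = 0
    while i < len(words):
        joined = None
        if i + 1 < len(words):
            joined = try_join(words[i], words[i + 1], known_words)
        if joined is not None:
            out.append(joined)
            i += 2                      # both fragments were consumed
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)
```

With a word list containing "modifications", the phrase "the modi cations" comes back as "the modifications". It will guess wrong now and then on short fragments, so treat the output as a reading aid and not as a faithful copy of the document.
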
cheers
Peter