On Thu, Aug 14, 2008 at 12:54 AM, Martin Langhoff <martin.langhoff@xxxxxxxxx> wrote:
> On Thu, Aug 14, 2008 at 3:26 AM, David Tweed <david.tweed@xxxxxxxxx> wrote:
>> FWIW, PDF format is a mix of sections of uncompressed higher level
>> ASCII notation and sections of compressed actual glyph/location data
>
> The PDF spec allows compression of the "text" sections - if a PDF is
> uncompressed, it's a good candidate for delta & compression.
> Unfortunately, within the same file you might have an embedded JPEG.

Sure, all I was pointing out was that even PDFs with compressed page
contents can look like uncompressed text from looking at the entropy
of the first 4k or 8k.

--
cheers, dave tweed__________________________
david.tweed@xxxxxxxxx
Rm 124, School of Systems Engineering, University of Reading.
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot
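
To make the entropy point above concrete, here is a minimal sketch, not
anything git actually does. It assumes "entropy" means the Shannon
entropy of the byte histogram, and it uses a synthetic blob in place of
a real PDF: an ASCII header padded to 4k, followed by pseudo-random
bytes standing in for a compressed stream. The names byte_entropy,
PREFIX_LEN and blob are hypothetical and exist only for this example.

```python
import math
import random
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


PREFIX_LEN = 4096  # the "first 4k" from the mail; 8192 would be the "8k" case

# Synthetic stand-in for a PDF (an assumption, not a real file):
# ASCII object syntax up front, high-entropy bytes afterwards.
ascii_obj = b"1 0 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n"
header = (b"%PDF-1.4\n" + ascii_obj * 200)[:PREFIX_LEN]

rng = random.Random(0)  # fixed seed so the example is repeatable
# Pseudo-random bytes imitate the statistics of a compressed stream.
body = bytes(rng.getrandbits(8) for _ in range(16384))

blob = header + body

print("prefix entropy:", byte_entropy(blob[:PREFIX_LEN]))
print("rest entropy:  ", byte_entropy(blob[PREFIX_LEN:]))
```

On this synthetic input the prefix scores well below the rest, so a
check that samples only the first 4k would class the whole blob as
plain text even though most of it would not compress further.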