On 21/03/2010 9:17 PM, David Newall wrote:
> Thanks for all of the suggestions, guys, which gave me some pointers on
> new directions to look, and I learned some interesting things.
> Unfortunately one of these processes dropped eventually, and, according
> to top, the only non-idle process running was gzip (100%). Obviously
> there were postgres and pg_dump processes, too, but they were throttled
> by gzip's rate of output and effectively idle (less than 1% CPU). That
> is also interesting. The final output from gzip was being produced at
> the rate of about 0.5MB/second, which seems almost unbelievably slow.

CPU isn't the only measure of interest here.
If pg_dump and the postgres backend it's using are doing simple work
such as reading linear data from disk, they won't show much CPU activity
even though they might be running full-tilt. They'll be limited by disk
I/O or other non-CPU resources.
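
To illustrate the point, here is a rough Python sketch (not something from the original setup) that compares CPU time with wall-clock time for a piece of work. The helper name cpu_fraction is hypothetical, and time.sleep() is only a stand-in for a process blocked on disk or pipe I/O:

```python
import time

def cpu_fraction(work) -> float:
    """Return CPU time used by work() divided by elapsed wall-clock time."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used

def blocked_on_io():
    # Assumption: sleeping stands in for waiting on a disk read or a full pipe.
    time.sleep(0.2)

def cpu_bound():
    # Pure computation, no waiting.
    sum(i * i for i in range(2_000_000))

print(f"waiting on I/O: {cpu_fraction(blocked_on_io):.0%} CPU")
print(f"computing:      {cpu_fraction(cpu_bound):.0%} CPU")
```

A process that spends its time waiting reports almost no CPU use even while it is doing all the work it can, which is why a low figure in top does not by itself say a process is idle.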

> and wonder if I should read up on gzip to find why it would work so
> slowly on a pure text stream, albeit a representation of PDF which
> intrinsically is fairly compressed.

In fact, PDF uses deflate compression, the same algorithm used for gzip.
Gzip-compressing PDF is almost completely pointless - all you're doing
is compressing some of the document structure, not the actual content
streams. With PDF 1.5 and above using object and xref streams, you might
not even be doing that, instead only compressing the header and trailer
dictionary, which are probably in the order of a few hundred bytes.
Compressing PDF documents is generally a waste of time.
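
If you want to see the effect for yourself, something like the following Python sketch should do. It is only an approximation: the word list and the generated text are made-up stand-ins for PDF content, and zlib.compress() stands in for the deflate compression applied to PDF content streams. The helper name gzip_ratio is hypothetical.

```python
import gzip
import random
import zlib

def gzip_ratio(data: bytes) -> float:
    """Return gzip-compressed size divided by original size (lower is better)."""
    return len(gzip.compress(data)) / len(data)

# Made-up stand-in for uncompressed page content: pseudo-random words,
# seeded so that the result is repeatable.
rng = random.Random(0)
words = [b"stream", b"endobj", b"font", b"page", b"text", b"matrix", b"line", b"fill"]
plain_text = b" ".join(rng.choice(words) for _ in range(200_000))

# Stand-in for a content stream that has already been deflated.
deflated = zlib.compress(plain_text)

print(f"plain text:       {gzip_ratio(plain_text):.2f}")
print(f"already deflated: {gzip_ratio(deflated):.2f}")
```

The first ratio should be small, while the second should sit close to 1.0: gzip spends the CPU time but has almost nothing left to remove.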
--
Craig Ringer