Re: Select count(*), the sequel

Thomas Kellerer <spam_eater@xxxxxxx> · Wed, 27 Oct 2010 22:54:11 +0200

Kenneth Marshall, 27.10.2010 22:41:
Different algorithms have been discussed before. A quick search turned
up:

quicklz - GPL or commercial
fastlz - MIT works with BSD okay
zippy - Google - no idea about the licensing
lzf - BSD-type
lzo - GPL or commercial
zlib - current algorithm

Of these lzf can compress at almost 3.7X of zlib and decompress at 1.7X
and fastlz can compress at 3.1X of zlib and decompress at 1.9X. The same
comparison put lzo at 3.0X for compression and 1.8X decompress. The block
design of lzl/fastlz may be useful to support substring access to toasted
data among other ideas that have been floated here in the past.

Just keeping the hope alive for faster compression.

What about a dictionary based compression (like DB2 does)?

In a nutshell: it creates a list of "words" in a page. For each word, the occurance in the db-block are stored and the actual word is removed from the page/block itself. This covers all rows on a page and can give a very impressive overall compression.
This compression is not done only on disk but in-memory as well (the page is loaded with the dictionary into memory).

I believe Oracle 11 does something similar.

Regards
Thomas

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance