On 8/5/09 7:12 AM, "Merlin Moncure" <mmoncure@xxxxxxxxx> wrote:

> On Tue, Aug 4, 2009 at 4:40 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> Scott Carey <scott@xxxxxxxxxxxxxxxxx> writes:
>>> There are a handful of other compression algorithms very similar to LZO in
>>> performance / compression level under various licenses.
>>> LZO is just the best known and most widely used.
>>
>> And after we get done with the license question, we need to ask about
>> patents. The compression area is just a minefield of patents. gzip is
>> known to avoid all older patents (and would be pretty solid prior art
>> against newer ones). I'm far less confident about lesser-known systems.
>
> I did a little bit of research. LZO and friends are variants of LZW.
> The main LZW patent died in 2003, and AFAIK there have been no patent
> enforcement cases brought against LZO or its cousins (LZO dates to
> 1996). OK, I'm no attorney, etc, but the internet seems to believe
> that the algorithms are patent free. LZO is quite widely used, in
> both open source and some relatively high profile commercial projects.
>

That doesn't sound right to me. LZW is patent protected in a few ways, and
is an LZ78 scheme. LZO, zlib, and the others here are LZ77 schemes, which
avoid the LZW patents. There are some other patents in the territory with
respect to how the hash lookups are done for the LZ77 'sliding window'
approach. Most notably, using a tree is patented, as are a couple of other
(obvious) tricks that are generally avoided anyway by any algorithm that is
trying to be fast rather than produce the highest compression.

http://en.wikipedia.org/wiki/Lossless_data_compression#Historical_legal_issues
http://en.wikipedia.org/wiki/LZ77_and_LZ78
http://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch
http://www.faqs.org/faqs/compression-faq/part1/section-7.html
http://www.ross.net/compression/patents.html

Note: US patents expire either 17 years after grant or 20 years after filing.
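
To make the 17-versus-20-year arithmetic concrete, here is a rough Python
sketch. It assumes the usual reading of the US rule, which is not spelled
out in this thread: an application filed before June 8, 1995 gets the later
of the two dates, and one filed on or after that date gets 20 years from
filing. The function names and the example dates are hypothetical, and term
adjustments, lapsed maintenance fees and terminal disclaimers are ignored,
so treat the result as a nominal date only, not legal advice.

```python
from datetime import date

# Assumed cutoff for the change in US patent term rules (not from the thread).
TERM_RULE_CUTOFF = date(1995, 6, 8)


def add_years(d, years):
    """Shift a date by whole years; Feb 29 falls back to Feb 28."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def nominal_expiry(filed, granted):
    """Nominal expiry under the assumed rule, ignoring all adjustments."""
    twenty_from_filing = add_years(filed, 20)
    if filed < TERM_RULE_CUTOFF:
        # Older applications: whichever of the two terms ends later.
        return max(add_years(granted, 17), twenty_from_filing)
    return twenty_from_filing


if __name__ == "__main__":
    # Hypothetical dates, chosen only to show a case where the two terms differ.
    filed, granted = date(1990, 6, 1), date(1992, 3, 1)
    print("17 years after grant: ", add_years(granted, 17))
    print("20 years after filing:", add_years(filed, 20))
    print("nominal expiry:       ", nominal_expiry(filed, granted))
```

With those made-up dates, the 17-year term ends before the 20-year term, so
under the assumed rule the later, filing-based date is the one that counts.
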
A very large chunk of those in this space have expired, but a few were
filed or granted in the early 90's. Those are generally more specific and
easy to avoid, or are very obvious duplicates of previous patents.

More notably, one of these, if interpreted broadly, would apply to zlib as
well (Gibson and Graybill), but the patent mentions LZRW1, and any broader
scope would have prior art conflicts with ones that are now long expired.
It's 17 years after grant on that one, but not 20 years after filing.

> I downloaded the libraries and did some tests.
> 2.5 G sql dump:
>
> compression time:
> zlib: 4m 1s
> lzo: 17s
> fastlz: 28.8s
> liblzf: 26.7s
>
> compression size:
> zlib: 609M 75%
> lzo: 948M 62%
> fastlz: 936M 62.5%
> liblzf: 916M 63.5%
>

Interesting how that conflicts with some other benchmarks out there (where
LZO and the others are about the same). But they're all an order of
magnitude faster than gzip/zlib.

> A couple of quick notes: liblzf produces (possibly) architecture
> dependent archives according to its header, and fastlz is not declared
> 'stable' according to its website.
>
> merlin
>

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
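
As a sanity check on the "order of magnitude" remark, the timings and sizes
quoted in the test above can be turned into space savings and speedups
relative to zlib. This is only a sketch of the arithmetic: it assumes
"2.5 G" means roughly 2500 M and that the quoted percentages are space
savings (1 - compressed/original). The helper names are made up for
illustration.

```python
# Assumption: the "2.5 G" dump is treated as 2500 M.
ORIGINAL_MB = 2500

# name -> (compression time in seconds, compressed size in M), as quoted above.
RESULTS = {
    "zlib": (4 * 60 + 1, 609),
    "lzo": (17.0, 948),
    "fastlz": (28.8, 936),
    "liblzf": (26.7, 916),
}


def space_saving(compressed_mb, original_mb=ORIGINAL_MB):
    """Fraction of the original size removed by compression."""
    return 1.0 - compressed_mb / original_mb


def speedup_over(baseline, name, results=RESULTS):
    """How many times faster `name` compressed than `baseline`."""
    return results[baseline][0] / results[name][0]


if __name__ == "__main__":
    for name, (seconds, size_mb) in RESULTS.items():
        print(
            f"{name:7s} saving {space_saving(size_mb):6.1%}  "
            f"throughput {ORIGINAL_MB / seconds:6.1f} M/s  "
            f"vs zlib {speedup_over('zlib', name):5.1f}x"
        )
```

Under those assumptions the savings land within about a percentage point of
the quoted figures, and the speedups over zlib come out at roughly 14x for
lzo, 8.4x for fastlz and 9.0x for liblzf.
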