On 04/19/2011 10:01 PM, Zeev Tarantov wrote:
On Wed, Apr 20, 2011 at 04:09, Nitin Gupta<ngupta@xxxxxxxxxx> wrote:
On 04/19/2011 08:01 PM, Zeev Tarantov wrote:
On Tue, Apr 19, 2011 at 14:31, Nitin Gupta<ngupta@xxxxxxxxxx> wrote:
I'm in the process of writing a simple test for all these algorithms:
http://code.google.com/p/compcache/source/browse/sub-projects/fstats/fstats.c
This compresses the input file page-by-page and dumps the compressed size
and time taken to compress each page. With this data, I hope to come up
with an adaptive scheme which uses different compressors for different
pages, such that the overall compression ratio stays good without hogging
the CPU the way zlib does when used alone.
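To make that concrete, the core of it is just a per-page measurement loop
roughly like the sketch below (illustrative only, not the actual fstats
code; it uses zlib's compress2() purely as a stand-in compressor and
CLOCK_PROCESS_CPUTIME_ID for timing):

/* Illustrative sketch of a per-page compression benchmark loop (not the
 * actual fstats code). compress2() is only a stand-in compressor.
 * Build (example): gcc -O2 page_bench.c -lz -lrt
 */
#include <stdio.h>
#include <time.h>
#include <zlib.h>

#define PAGE_SIZE 4096

int main(int argc, char **argv)
{
	unsigned char in[PAGE_SIZE], out[PAGE_SIZE * 2];
	struct timespec t0, t1;
	FILE *f;
	size_t n;

	if (argc < 2 || !(f = fopen(argv[1], "rb"))) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	while ((n = fread(in, 1, PAGE_SIZE, f)) > 0) {
		uLongf out_len = sizeof(out);
		long long ns;

		clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t0);
		if (compress2(out, &out_len, in, n, Z_DEFAULT_COMPRESSION) != Z_OK)
			out_len = n;	/* treat failure as incompressible */
		clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t1);

		ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL +
		     (t1.tv_nsec - t0.tv_nsec);
		/* one line per page: input size, compressed size, nanoseconds */
		printf("%zu %lu %lld\n", n, (unsigned long)out_len, ns);
	}
	fclose(f);
	return 0;
}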
I have extended the block compressor tester I wrote for Dan Magenheimer
(http://driverdev.linuxdriverproject.org/pipermail/devel/2011-April/015127.html)
to show this data.
It compresses a file one page at a time, computes a simple histogram of
compressed sizes, records elapsed CPU time, and writes the compressed
blocks (and an index) so the original file can be restored (to prevent
cheating, basically).
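(The histogram part is nothing fancy -- it just buckets the per-page
compressed sizes, roughly like this illustrative sketch; this is not the
code from the repository linked below, and the 256-byte bucket size is
arbitrary:)

#include <stdio.h>

#define PAGE_SIZE	4096
#define BUCKET		256
#define NBUCKETS	(PAGE_SIZE / BUCKET + 1)

static unsigned long hist[NBUCKETS];

static void account(unsigned int clen)
{
	unsigned int b = clen / BUCKET;

	if (b >= NBUCKETS)
		b = NBUCKETS - 1;
	hist[b]++;
}

int main(void)
{
	unsigned int clen, i;

	/* read one compressed size per line, e.g. from the tester's output */
	while (scanf("%u", &clen) == 1)
		account(clen);
	for (i = 0; i < NBUCKETS; i++)
		printf("%4u-%4u: %lu\n", i * BUCKET, (i + 1) * BUCKET - 1,
		       hist[i]);
	return 0;
}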
Code: https://github.com/zeevt/csnappy/blob/master/block_compressor.c
Results:
https://github.com/zeevt/csnappy/blob/master/block_compressor_benchmark.txt
Because people don't click links, results inlined:
fstats now does all this and gnuplot does the histogram :)
Anyway, I don't have any issues with links and no problem copy-pasting
your stuff here if needed for context. Still, when posting your patches,
it would be better to keep some of these performance numbers in the
patch description.
fstats in mercurial is at the original version, with no snappy support
and with zlib reallocating memory inside the timed inner loop.
I just forgot to commit the changes. It should be reflected in the
repository now.
Anyway, are the benchmarks I posted enough? Should I use more different
kinds of data? What kind of data do people want to compress in RAM?
Would a mode of the tester that accepts a pid instead of a path and
compresses the read-only mapped pages listed in /proc/<pid>/maps be more
interesting?
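(The idea would be roughly: parse /proc/<pid>/maps for read-only
mappings, pread() the pages out of /proc/<pid>/mem and feed them to the
compressors. An untested sketch of just the page-reading part, error
handling mostly omitted -- note that reading another process's
/proc/<pid>/mem may require ptrace-attaching first, depending on the
kernel:)

/* Sketch: dump the read-only mapped pages of a process so they can be
 * fed to the per-page compressors. Error handling mostly omitted.
 * Use -D_FILE_OFFSET_BITS=64 on 32-bit builds.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

#define PAGE_SIZE 4096

int main(int argc, char **argv)
{
	char path[64], line[256], perms[8];
	unsigned long long start, end, off;
	unsigned char page[PAGE_SIZE];
	FILE *maps;
	int mem_fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <pid>\n", argv[0]);
		return 1;
	}
	snprintf(path, sizeof(path), "/proc/%s/maps", argv[1]);
	maps = fopen(path, "r");
	snprintf(path, sizeof(path), "/proc/%s/mem", argv[1]);
	mem_fd = open(path, O_RDONLY);
	if (!maps || mem_fd < 0) {
		perror("open");
		return 1;
	}

	while (fgets(line, sizeof(line), maps)) {
		if (sscanf(line, "%llx-%llx %7s", &start, &end, perms) != 3)
			continue;
		if (perms[0] != 'r' || perms[1] != '-')
			continue;	/* only read-only mappings */
		for (off = start; off < end; off += PAGE_SIZE) {
			if (pread(mem_fd, page, PAGE_SIZE, off) != PAGE_SIZE)
				continue;
			/* hand 'page' to the compressors here */
			fwrite(page, 1, PAGE_SIZE, stdout);
		}
	}
	fclose(maps);
	close(mem_fd);
	return 0;
}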
Since zram as a swap disk is an important use case, we should really test
against "anonymous" data, i.e. data which is written out to swap disks
under memory pressure. For this, you can use "SwapReplay" [1][2] -- a set
of userspace utilities and a kernel module -- to test the compressibility
of such pages. This is the same infrastructure that was used to evaluate
LZO.
Links:
[1] http://code.google.com/p/compcache/wiki/SwapReplay
[2] http://code.google.com/p/compcache/source/browse/#hg%2Fsub-projects%2Fswap_replay
Another zram use case is as a compressed RAM disk. We can already see
/var/run mounted over tmpfs on Fedora 15, and zram could replace tmpfs in
this case, at least on memory-constrained systems.
So for the case of zram as a generic filesystem disk, we need to test
compressibility with different kinds of files: binaries from /usr/lib64,
text file archives, and whatever other kinds of data you can think of.
For now, I have just run fstats on an archive of /usr/lib64 and on the
ISO from Project Gutenberg
(http://www.gutenberg.org/wiki/Gutenberg:The_CD_and_DVD_Project -- March
2007). I will post this data soon.
What do you think of the submitted patch as-is?
csnappy as a compile-time option is definitely welcome.
Manual runtime switching is not really needed -- instead of adding more
knobs for this, it would be better to have some simple scheme to
automatically switch between the available compressors.
Would you please (test and) ack this, then?
http://driverdev.linuxdriverproject.org/pipermail/devel/2011-April/015126.html
Or does it need changes?
A quick look shows nothing wrong with the code. But please let me have a
better look at it with some testing. I will reply to that soon.
RE: adaptive compression method.
What kind of simple scheme did you have in mind? I will gladly prototype it.
Currently, I have only very simple ideas, but the gist is to keep the
compression ratio high while not pounding the CPUs (as would happen if,
say, zlib were used alone).
A simple adaptive compression scheme could be:
- Look at compressor performance (mainly compression ratio) for the
last, say, 8 pages.
- If performance is good, keep the same compressor for the next 8 pages.
- If performance is not so good, switch to a better/"heavier" compressor
(zlib) and let it handle the next 8 pages.
- If even the best of our compressors fails to give good compression,
turn off compression entirely for the next 8 pages, and after that try
again with the lightest of the compressors (csnappy).
- If csnappy is performing well and CPU load is low, gradually switch to
heavier compressors.
- We also switch from a heavier to a lighter compressor if, say, zlib is
compressing pages *really* well -- this implies the pages are very
compressible, so it is probably better to use a lighter algorithm for a
similar compression ratio at about 5x less CPU overhead.
- We could also factor in metrics like current CPU load and memory
pressure (under higher memory pressure, favour zlib, etc.).
I'm sure many of the details are missing but this is approximately the idea.
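In (completely untested) pseudo-C, the selection logic could look
something like the sketch below; the window size and thresholds are
arbitrary placeholders, and the compressor names are just labels:

/*
 * Sketch of the adaptive selection idea -- untested, with arbitrary
 * placeholder thresholds. Levels, lightest to heaviest:
 * NONE < SNAPPY < LZO < ZLIB.
 * Every WINDOW pages, look at the average compressed size achieved and
 * step one level up or down.
 */
enum comp_level { COMP_NONE, COMP_SNAPPY, COMP_LZO, COMP_ZLIB };

#define WINDOW		8
#define PAGE_SIZE	4096
/* "very good": <= 12.5% of page size, a lighter algo would likely do too */
#define VERY_GOOD_BYTES	(PAGE_SIZE / 8)
/* "good": <= 50% of page size (placeholder) */
#define GOOD_BYTES	(PAGE_SIZE / 2)
/* "bad": barely compressed at all, stop wasting cycles */
#define BAD_BYTES	(PAGE_SIZE * 9 / 10)

struct adapt_state {
	enum comp_level level;		/* compressor for current window */
	unsigned int pages;		/* pages seen in current window */
	unsigned long clen_sum;		/* sum of compressed sizes */
};

/* Call once per page with the compressed length just produced. */
static void adapt_account(struct adapt_state *s, unsigned int clen)
{
	unsigned long avg;

	s->clen_sum += clen;
	if (++s->pages < WINDOW)
		return;

	avg = s->clen_sum / WINDOW;
	if (avg >= BAD_BYTES) {
		/* nothing compresses well: back off completely, then retry
		 * later with the lightest compressor (csnappy) */
		s->level = (s->level == COMP_NONE) ? COMP_SNAPPY : COMP_NONE;
	} else if (avg <= VERY_GOOD_BYTES && s->level > COMP_SNAPPY) {
		/* pages are highly compressible anyway; a lighter algorithm
		 * gives a similar ratio for much less CPU */
		s->level--;
	} else if (avg > GOOD_BYTES && s->level < COMP_ZLIB) {
		/* mediocre ratio: try a heavier compressor (CPU load and
		 * memory pressure could also be factored in here) */
		s->level++;
	}
	/* else keep the current compressor for the next window */
	s->pages = 0;
	s->clen_sum = 0;
}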
If you're serious about resorting to zlib in some cases, maybe it should
be tuned for in-memory compression instead of archiving: the streaming
interface slows it down, it has an adler32 checksum zram doesn't need,
and the defaults are not tuned for 4KB input (max_lazy_match, good_match,
nice_match, max_chain_length).
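Concretely, I mean something along the lines of this untested sketch: raw
deflate (negative windowBits) drops the zlib header and adler32, and
deflateTune() overrides the length/chain parameters (the values here are
placeholders, not measured optima). The in-kernel zlib copy has no
deflateTune(), so there the equivalent change would be editing its
configuration table:

/* Untested sketch: zlib set up for 4KB in-memory pages rather than
 * streaming/archiving. The tuning values are placeholders.
 */
#include <string.h>
#include <zlib.h>

#define PAGE_SIZE 4096

static int compress_page_zlib(const unsigned char *in, unsigned char *out,
			      unsigned int out_avail, unsigned int *out_len)
{
	z_stream zs;
	int ret;

	memset(&zs, 0, sizeof(zs));
	/* level 6, raw deflate (no header/adler32), 12-bit window is
	 * plenty for 4KB input, memLevel 8, default strategy */
	ret = deflateInit2(&zs, 6, Z_DEFLATED, -12, 8, Z_DEFAULT_STRATEGY);
	if (ret != Z_OK)
		return -1;

	/* good_length, max_lazy, nice_length, max_chain -- placeholders */
	deflateTune(&zs, 8, 16, 32, 128);

	zs.next_in = (Bytef *)in;
	zs.avail_in = PAGE_SIZE;
	zs.next_out = out;
	zs.avail_out = out_avail;

	ret = deflate(&zs, Z_FINISH);
	*out_len = zs.total_out;
	deflateEnd(&zs);

	return (ret == Z_STREAM_END) ? 0 : -1;
}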
I've almost no experience working on compression algorithms themselves,
so it would be helpful if you made these optimizations yourself and
posted the results. When using zlib as-is, I found a CPU overhead (w.r.t.
LZO) of about 5x! (I will post detailed data from fstats soon.)
Thanks,
Nitin