[linux-kernel trimmed from the Cc list]

johnrobertbanks@xxxxxxxxxxx writes:

> [lots of drivel with lots of capital letters elided]

> [a totally confused mess of responses to at least three different mails]

Wow. You are absolutely amazing. But, oh, well.

> By the way: Did I thank you "delightful" people for the "pleasant"
> welcome to the linux-kernel mailing list?

You are so welcome. *grin*

> > So the two bonnie benchmarks with lzo and gzip are
> > totally meaningless for any real life usages.
>
> YOU (yes, the one with no experience and next to NO knowledge on the
> subject) claim that because bonnie++ writes files that are mostly zeros,
> the results are meaningless. It should be mentioned that bonnie++ writes
> files that are mostly zero for all the filesystems compared. So the
> results are meaningful, contrary to would you claim.

OK, let's take this really slowly so that you may understand. Compression
in the file system can be useful, there is no doubt about that, but you
have to be aware of the tradeoffs.

First of all, very few people have any use for storing files consisting
of just zeroes. Trying to make any decision based on a file system's
ability to compress zeroes is just plain dumb. Bonnie++ assumes that the
data it writes will end up being written to disk and not be compressed.
Right now it allocates a buffer which is filled with zeroes, and half a
dozen bytes at the beginning of the buffer are filled in with some random
data. So to make the bonnie runs mean anything on a compressed file
system, you really want to be able to choose what data it writes to disk.
If you modified bonnie to do multiple test runs: one with zeroes, one
with some easily compressed data such as a syslog, one with some not so
easily compressed data such as the contents of /bin/bash, and one with
uncompressible data such as from /dev/urandom, that would be a much
better benchmark (a crude way to approximate this by hand is sketched a
bit further down).

Then you have to be aware of the cost of that compression. First of all,
it is going to use some CPU, so measuring the CPU load during the
benchmark is a good start. Another thing with compression is that it
requires you to keep both the compressed and the uncompressed data in RAM
at the same time, so the memory pressure will increase. This is harder to
measure and quantify. Finally, since the CPU has to get involved in
compressing and decompressing the data, doing so will pull both the
uncompressed and the compressed data into the CPU cache and may evict
data that other processes would have found useful. This cache pollution
is even harder to measure. None of these costs make any difference for
benchmarks run on a lightly loaded system, but they may make a difference
in real life on any system that tries to do something useful at the same
time.

Then you have to consider your use cases. As I said in my previous mail,
for my only space constrained disk, I store a lot of large flac encoded
CD images. That data is basically uncompressible, so compression buys me
nothing; it just costs me a lot of extra CPU to try to compress
uncompressible data. In addition, each CD image is quite large too, about
300 MByte for a full size album, so whatever savings I can get through
the tail merging that Reiser4 (and Reiser3) does are marginal for my use
case. Other use cases might have a lot to gain from compression.
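To be concrete, here is a rough way to approximate such a test by hand
with plain shell tools instead of patching bonnie. The /mnt/test mount
point and the 64 MByte size are just placeholders I made up; point them
at the file system you actually want to measure:

  # all zeroes, trivially compressible (roughly what bonnie++ writes today)
  time sh -c 'dd if=/dev/zero of=/mnt/test/zero.dat bs=1M count=64; sync'

  # uncompressible data
  time sh -c 'dd if=/dev/urandom of=/mnt/test/rand.dat bs=1M count=64; sync'

  # ordinary text and an ordinary binary, somewhere in between
  time sh -c 'cat /var/log/messages >/mnt/test/text.dat; sync'
  time sh -c 'cat /bin/bash >/mnt/test/bin.dat; sync'

Comparing the elapsed and user/sys times of the zero run against the
urandom run gives at least a hint of how much of the headline number
comes from compressing zeroes and how much CPU the compression itself
eats. It is crude (reading /dev/urandom costs CPU by itself, so generate
that data into a file on another disk first if you want to be picky), but
it is a lot closer to real life than a pure zero filled benchmark.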
> ALSO YOU IGNORE examples offered by others, on lkml, which contradict
> your assertion: FOR EXAMPLE:
>
> > > I see the same thing with my nightly scripts that do syslog
> > > analysis, last year I trimmed 2 hours from the nightly run by
> > > processing compressed files instead of uncompressed ones (after I
> > > did this I configured it to compress the files as they are rolled,
> > > but rolling every 5 min the compression takes <20 seconds, so the
> > > compression is < 30 min)
>
> David has said that compressing the logs takes
>
> 24 x 12 x 20 secs = 5,760 secs = 1.6 hours of CPU time (over the day)
>
> but he saves 2 hours of CPU time on the daily syslog analysis.
>
> For a total (minimum) saving of 24 minutes.

So let's look at the syslog case then. First of all, let's compress my
syslog with gzip:

  gzip -c /var/log/messages >whole.gz
  du -h /var/log/messages whole.gz
  532K    messages
  64K     whole.gz

Unfortunately, this compressed format isn't very efficient for some use
cases. Let's say that I want to read the last 10 lines of the syslog. On
a normal uncompressed file system I can just seek to the end of the file,
read the last block and get those 10 lines (or if the last block didn't
have 10 lines, I can try the block before that). But with a compressed
file I have to uncompress the whole file and throw away 531 kBytes at the
beginning of the file to do that.

So a file system that wants to give the user efficient random access to a
file can't compress the whole file as done above. It has to make some
tradeoffs to make random access practically usable. Most compressing file
systems do that by splitting the file into fixed size chunks which are
compressed independently of each other. So let's simulate that by
splitting the file into 4k chunks, compressing those separately, and then
combining them:

  split -b 4096 /var/log/messages chunks
  gzip chunks*
  cat chunks*.gz >combined.gz
  du -h combined.gz
  120K    combined.gz

The reason for the loss of compression is that the chunk based
compression can't reuse knowledge across chunks. It's possible to
mitigate this by increasing the chunk size:

  84K     combined-16368-chunk-size.gz
  72K     combined-65536-chunk-size.gz

but once again that has a downside: the bigger the chunks, the more data
will be uncompressed unnecessarily when doing random accesses in the file
(a recipe for reproducing these numbers is sketched below).

So the syslog example you are quoting above does not tell you how well
reiser4 will do on that specific use case. A lot of the benefit in
David's example comes from knowing that he wants to process the file as a
whole and doesn't need random access. So having application specific
knowledge and doing the compression outside of the file system is what
gives him that gain. Of course, it may also be that the convenience of
having transparent compression in the file system is worth more than the
roughly 50% better compression you get from compressing the syslog
manually. That depends.

> but he saves 2 hours of CPU time on the daily syslog analysis.

And no, he spends 1.6 hours of CPU time (maybe; some IO wait is probably
included in that number) to save 2 hours of runtime (mostly IO wait, I
assume). So it seems that the disk is the bottleneck in his case. On a
slightly different system the CPU might be the bottleneck because the
same machine has to do a lot of processing at the same time, so that it's
better to skip the compression. Once again, it depends.
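If you want to reproduce the chunk size numbers above yourself, a small
loop like this (run in an empty scratch directory; the sizes are just the
ones used for the file names above) should do the job:

  for size in 4096 16368 65536; do
      rm -f chunk*
      split -b $size /var/log/messages chunk
      gzip chunk*
      cat chunk*.gz >combined-$size-chunk-size.gz
      du -h combined-$size-chunk-size.gz
  done

Note that this only measures the pure compression overhead; a real file
system also has to store some per chunk metadata to find the compressed
chunks again, so the in-kernel numbers will differ somewhat.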
What I'm trying to get at here is that yes, compression can be useful,
but it is very use case dependent, and it's impossible to catch all the
nuances of all use cases in one single number, especially an extremely
artificial number such as a bonnie++ run with files mostly consisting of
zeroes.

You can foam at the mouth and post the same meaningless benchmark figures
over and over again and yell even louder, but that still doesn't make
them relevant. It's not reiser4 that is the problem, but the way you try
to present reiser4 as the best thing since sliced bread.

To misquote Scott Adams: I'm not anti-Reiser4, I'm anti-idiot.

/Christer

-- 
"Just how much can I get away with and still go to heaven?"

Christer Weinigel <christer@xxxxxxxxxxx>   http://www.weinigel.se