Another thing that makes me wonder is the read-subvolume setting:

> volume afr
>   type cluster/replicate
>   ...
>   option read-subvolume node2
>   ...
> end-volume

Even if we play around and set this to the local node or to some remote node, it doesn't gain any performance for small files. It looks like the whole bottleneck for small files is the meta-data and global namespace lookup.

It would be really great if all of this could be cached within io-cache, falling back to a namespace query (and probably locking) only when something wants to write to the file, or when the entry has been sitting in the cache for longer than cache-timeout seconds. So even if the file has been renamed, unlinked, or has changed permissions / metadata, simply serve the version from io-cache until it gets invalidated. At least that is what I would expect io-cache to do. This does introduce a discrepancy between the cached file version and the real version in the global namespace, but isn't that exactly what one would expect from caching...?

Note that in all tests the cache-size on every node was 1024MB, and the whole set of test data was ~240MB; add some meta-data and it's probably around 250MB. In addition, cache-timeout was 60 seconds, while the whole test took around 40 seconds. So *all* of the read-only test could have been served entirely from io-cache... or am I mistaken here? I'm trying to understand the poor performance, because network latency should be eliminated by the cache.

Could some Gluster dev please elaborate a bit on that one?

Best Regards,
John
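
P.S.: For reference, the io-cache part of my volfile looks roughly like the sketch below. I'm quoting it from memory, so the volume name "iocache" and the subvolume name "afr" are just placeholders matching the replicate volume quoted above, but the two options are the values used in the tests:

volume iocache
  type performance/io-cache
  # 1024MB is large enough to hold the whole ~240MB test set plus meta-data
  option cache-size 1024MB
  # 60s timeout is longer than the ~40s the read-only test takes
  option cache-timeout 60
  subvolumes afr
end-volume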