I am working with a unified AFR filesystem with 4 servers: AFR is done on the
client side, the clients are also the servers, and the namespace is AFRed as well.
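For reference, my client-side spec looks roughly like the sketch below; the
hostnames, export names, and the way the bricks are paired are placeholders
here rather than my exact config:

volume data1
  type protocol/client
  option transport-type tcp/client
  option remote-host server1
  option remote-subvolume brick
end-volume

volume data2
  type protocol/client
  option transport-type tcp/client
  option remote-host server2
  option remote-subvolume brick
end-volume

volume data3
  type protocol/client
  option transport-type tcp/client
  option remote-host server3
  option remote-subvolume brick
end-volume

volume data4
  type protocol/client
  option transport-type tcp/client
  option remote-host server4
  option remote-subvolume brick
end-volume

volume ns1
  type protocol/client
  option transport-type tcp/client
  option remote-host server1
  option remote-subvolume ns-brick
end-volume

volume ns2
  type protocol/client
  option transport-type tcp/client
  option remote-host server2
  option remote-subvolume ns-brick
end-volume

# client-side AFR over the data bricks
volume afr-data1
  type cluster/afr
  subvolumes data1 data2
end-volume

volume afr-data2
  type cluster/afr
  subvolumes data3 data4
end-volume

# the namespace is AFRed too
volume afr-ns
  type cluster/afr
  subvolumes ns1 ns2
end-volume

volume unify
  type cluster/unify
  option namespace afr-ns
  option scheduler rr
  subvolumes afr-data1 afr-data2
end-volume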
I notice that with multiple dd processes (spread across the machines)
writing to the filesystem, ls -l (and even plain ls, which is odd, since it
should only touch the namespace shares) becomes rather slow: tens of seconds
to several minutes. Splitting the server shares into multiple glusterfsd
processes helps, and dropping the performance translators seems to help a
little (perhaps because the server processes are then less in demand).
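By splitting, I mean giving each export its own glusterfsd process on its
own port, with separate spec files, roughly like this (ports and paths are
just examples, not my actual layout):

# /etc/glusterfs/data-server.vol -- data export in its own process
volume brick
  type storage/posix
  option directory /data/export
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option listen-port 6996
  option auth.ip.brick.allow *
  subvolumes brick
end-volume

# /etc/glusterfs/ns-server.vol -- namespace export in a second process
volume ns-brick
  type storage/posix
  option directory /data/export-ns
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option listen-port 6997
  option auth.ip.ns-brick.allow *
  subvolumes ns-brick
end-volume

# started separately on each server node:
#   glusterfsd -f /etc/glusterfs/data-server.vol
#   glusterfsd -f /etc/glusterfs/ns-server.vol

(The matching protocol/client volumes then point at the right process with
option remote-port.)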
Also, I notice that when I use rm to remove 10GB files, ls hangs completely
until the rm processes have finished (blocking?).
Reads impact ls performance, too, but to a much, much smaller degree.
I would suspect that my gigabit links are saturated, but my ssh sessions are
perfectly responsive, and while multiple high-speed writes are occurring I
can ssh to the server nodes and ls -l the shares directly far faster than I
can through the GlusterFS mount.
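Concretely, the comparison I'm making is just the following (the mount point
and export path are placeholders):

# through the GlusterFS mount
time ls -l /mnt/glusterfs/testdir

# directly against the underlying export on one server
ssh server1 'time ls -l /data/export/testdir'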
Any ideas on how to improve the ls performance? Could GlusterFS be tweaked
to give priority (perhaps via a separate thread) to metadata-type queries
over writes (and especially over rm)? Such metadata queries need very little
bandwidth, but their latency has to stay reasonably low or the filesystem
feels terribly unresponsive, even when it's handling reads and writes rather
well.
Thanks,
Brent