I am working with a unified AFR filesystem with 4 servers: AFR is done on the
client side, the clients are also the servers, and the namespace is AFRed as well.
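For reference, my client-side spec looks roughly like the sketch below; the
hostnames, export names, and the way the bricks are paired are placeholders
here rather than my exact config:

volume data1
  type protocol/client
  option transport-type tcp/client
  option remote-host server1
  option remote-subvolume brick
end-volume

volume data2
  type protocol/client
  option transport-type tcp/client
  option remote-host server2
  option remote-subvolume brick
end-volume

volume data3
  type protocol/client
  option transport-type tcp/client
  option remote-host server3
  option remote-subvolume brick
end-volume

volume data4
  type protocol/client
  option transport-type tcp/client
  option remote-host server4
  option remote-subvolume brick
end-volume

volume ns1
  type protocol/client
  option transport-type tcp/client
  option remote-host server1
  option remote-subvolume ns-brick
end-volume

volume ns2
  type protocol/client
  option transport-type tcp/client
  option remote-host server2
  option remote-subvolume ns-brick
end-volume

# client-side AFR over the data bricks
volume afr-data1
  type cluster/afr
  subvolumes data1 data2
end-volume

volume afr-data2
  type cluster/afr
  subvolumes data3 data4
end-volume

# the namespace is AFRed too
volume afr-ns
  type cluster/afr
  subvolumes ns1 ns2
end-volume

volume unify
  type cluster/unify
  option namespace afr-ns
  option scheduler rr
  subvolumes afr-data1 afr-data2
end-volume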
I notice that with multiple dd processes (spread across the machines)
writing to the filesystem, ls -l (and even plain ls, which is odd, since it
should only touch the namespace shares) becomes rather slow: tens of seconds
to several minutes. Splitting the server shares into multiple glusterfsd
processes helps, and dropping the performance translators seems to help a
little (perhaps because the server processes are then less in demand).
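By splitting, I mean giving each export its own glusterfsd process on its
own port, with separate spec files, roughly like this (ports and paths are
just examples, not my actual layout):

# /etc/glusterfs/data-server.vol -- data export in its own process
volume brick
  type storage/posix
  option directory /data/export
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option listen-port 6996
  option auth.ip.brick.allow *
  subvolumes brick
end-volume

# /etc/glusterfs/ns-server.vol -- namespace export in a second process
volume ns-brick
  type storage/posix
  option directory /data/export-ns
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option listen-port 6997
  option auth.ip.ns-brick.allow *
  subvolumes ns-brick
end-volume

# started separately on each server node:
#   glusterfsd -f /etc/glusterfs/data-server.vol
#   glusterfsd -f /etc/glusterfs/ns-server.vol

(The matching protocol/client volumes then point at the right process with
option remote-port.)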
Also, I notice that when I use rm to remove 10GB files, ls hangs completely
until the rm processes have finished (blocking?).
Reads impact ls performance, too, but to a much, much smaller degree.
I would suspect that my gigabit links are saturated, but my ssh sessions are
perfectly responsive, and while multiple high-speed writes are occurring I
can ssh to the server nodes and ls -l the shares directly far faster than I
can through the GlusterFS mount.
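Concretely, the comparison I'm making is just the following (the mount point
and export path are placeholders):

# through the GlusterFS mount
time ls -l /mnt/glusterfs/testdir

# directly against the underlying export on one server
ssh server1 'time ls -l /data/export/testdir'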
Any ideas on how to improve the ls performance? Could GlusterFS be tweaked
to give priority (perhaps via a separate thread) to metadata-type queries
over writes (and especially over rm)? Such metadata queries need very little
bandwidth, but their latency has to stay reasonably low or the filesystem
feels terribly unresponsive, even when it's handling reads and writes rather
well.
Thanks,
Brent