On Wed, 20 Feb 2008, Anand Avati wrote:
>> I am working with a unified AFR filesystem with 4 servers: AFR is on the
>> client side, the clients are the servers, and the namespace is also AFRed.
>> I notice that with multiple dd processes (spread across the machines)
>> writing to the filesystem, ls -l (and even just ls, which is odd, as it
>> should only access the namespace shares) is rather slow (tens of seconds
>> to several minutes). Splitting the server shares into multiple glusterfsd
>> processes helps, and not using the performance translators seems to help a
>> little (perhaps because the server processes are then less in demand).
>
> Do you have io-threads on the *server*? When you are writing, io-threads
> pushes the write ops to a separate thread and keeps the main thread free
> for metadata ops.

I did have io-threads on the server. After this email, I tried removing
ALL performance translators, client and server (including io-threads on
the server), and found that I obtained much better ls performance.
Restoring io-threads to the server (as the only performance translator)
was even better (often far better, never significantly worse)!
One or more of the other performance translators, however, either on the
client or the server, seems to harm ls performance significantly, but I
haven't tracked down the culprit. I'll let you know if I do.
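
One way to compare translator combinations is to time directory listings
while the dd writers are running. The following is only a minimal sketch
under stated assumptions, not the procedure used for the observations above:
the function name time_listing and the mount point /mnt/glusterfs in the
usage note are hypothetical, and listing names only versus calling lstat on
each entry merely approximates plain ls versus ls -l.

```python
import os
import time


def time_listing(path, repeats=5, stat_entries=True):
    """Return the wall-clock seconds taken by each of `repeats` listings.

    stat_entries=True  roughly mimics `ls -l` (one lstat per entry).
    stat_entries=False roughly mimics plain `ls` (names only).
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for name in os.listdir(path):
            if stat_entries:
                try:
                    os.lstat(os.path.join(path, name))
                except FileNotFoundError:
                    # The entry was removed between listdir and lstat,
                    # e.g. by a concurrent rm; skip it.
                    pass
        timings.append(time.perf_counter() - start)
    return timings
```

For example, running max(time_listing("/mnt/glusterfs")) once per
configuration (all translators, none, io-threads only) gives a worst-case
listing time for each that can be compared directly.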

>> Also, I notice that when using rm to remove 10GB files, ls will hang
>> completely until the rm processes have finished (blocking?).
>
> Is your backend ext3? rm taking an excessively long time on ext3 is a
> known issue. We plan to work around this in future versions by treating
> unlink as an IO operation rather than a metadata op.

Yep, it's ext3. I look forward to the unlink change (or maybe a
metadata-threads performance translator?). Right now, large rm tasks
completely block ls lookups until the rm completes.
Thanks,
Brent