On Wednesday 30 January 2008, Al Boldi wrote:
> And, a quick test of successive 1sec delayed syncs shows no hangs until
> about 1 minute (~180mb) of db-writeout activity, when the sync abruptly
> hangs for minutes on end, and io-wait shows almost 100%.

How large is the journal in this filesystem?  You can check via
"debugfs -R 'stat <8>' /dev/XXX".

Is this affected by increasing the journal size?  You can set the journal
size via "mke2fs -J size=400" at format time, or on an unmounted filesystem
by running "tune2fs -O ^has_journal /dev/XXX" then
"tune2fs -J size=400 /dev/XXX".

I suspect that the stall is caused by the journal filling up, and then
waiting while the entire journal is checkpointed back to the filesystem
before the next transaction can start.

It is possible to improve this behaviour in JBD by reducing the amount of
space that is cleared if the journal becomes "full", and also doing
journal checkpointing before it becomes full.  While that may reduce
performance a small amount, it would help avoid such huge latency problems.
I believe we have such a patch in one of the Lustre branches already, and
while I'm not sure what kernel it is for, the JBD code rarely changes much....

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
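
As a rough illustration of the stall suspected above, here is a toy model of
a journal that blocks the writer whenever it fills up. It is only a sketch,
not JBD code: the function name simulate() and all of its parameters are
hypothetical, the 180 MB journal and 3 MB/s write rate are loosely taken from
the quoted report (~180mb in about a minute), and the 1.5 MB/s checkpoint
rate is an assumption chosen so that a full checkpoint takes a couple of
minutes. It models only the "clear less space when full" idea, not
checkpointing before the journal becomes full.

```python
def simulate(journal_mb, write_mb_per_s, checkpoint_mb_per_s,
             clear_fraction, seconds):
    """Toy model of a journal that stalls the writer when it is full.

    Returns (number_of_stalls, longest_stall_in_seconds).
    All names and numbers are illustrative assumptions, not JBD internals.
    """
    used = 0.0      # MB of journal space currently occupied
    clock = 0.0     # simulated wall-clock time in seconds
    stalls = 0
    longest = 0.0

    while clock < seconds:
        # The writer runs freely until the journal is full.
        clock += (journal_mb - used) / write_mb_per_s
        if clock >= seconds:
            break

        # Journal is full: the writer blocks while clear_fraction of the
        # journal is checkpointed back to the filesystem.
        freed = journal_mb * clear_fraction
        stall = freed / checkpoint_mb_per_s
        clock += stall
        used = journal_mb - freed

        stalls += 1
        longest = max(longest, stall)

    return stalls, longest


if __name__ == "__main__":
    # Clear the entire journal when it fills up.
    print(simulate(180, 3, 1.5, clear_fraction=1.0, seconds=600))
    # Clear only a quarter of the journal when it fills up.
    print(simulate(180, 3, 1.5, clear_fraction=0.25, seconds=600))
```

Under these assumed numbers, clearing the whole journal gives a few long
stalls, while clearing a quarter of it gives more frequent but much shorter
ones. The total time spent stalled is the same in both runs, so the model
shows only the latency effect and says nothing about the small performance
cost mentioned above.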