On Mon, Dec 27, 2010 at 07:19:39PM +0200, Petre Rodan wrote:
>
> Hello Dave,
>
> On Tue, Dec 28, 2010 at 01:07:50AM +1100, Dave Chinner wrote:
> > Turn on the XFS tracing so we can see what is being written every
> > 36s. When the problem shows up:
> >
> > # echo 1 > /sys/kernel/debug/tracing/events/xfs/enable
> > # sleep 100
> > # cat /sys/kernel/debug/tracing/trace > trace.out
> > # echo 0 > /sys/kernel/debug/tracing/events/xfs/enable
> >
> > And post the trace.out file for us to look at.
>
> attached.
>
> you can disregard all the lvm partitions ('dev 254:.*') since they are
> on a different drive, probably only 8:17 is of interest.

Ok, I can see the problem. The original patch I tested:

http://oss.sgi.com/archives/xfs/2010-08/msg00026.html

Made the log covering dummy transaction a synchronous transaction so
that the log was written and the superblock unpinned immediately to
allow the xfsbufd to write back the superblock and empty the AIL
before the next log covering check.

On review, the log covering dummy transaction got changed to an
async transaction, so the superblock buffer is not unpinned
immediately. This was the patch committed:

http://oss.sgi.com/archives/xfs/2010-08/msg00197.html

As a result, the success of log covering and idling is then
dependent on whether the log gets written to disk to unpin the
superblock buffer before the next xfssyncd run. It seems that there
is a large chance that this log write does not happen, so the
filesystem never idles correctly. I've reproduced it here, and only
in one test out of ten did the filesystem enter an idle state
correctly. I guess I was unlucky enough to hit that 1-in-10 case
when I tested the modified patch.

I'll cook up a patch to make the log covering behave like the
original patch I sent...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
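
For anyone repeating the trace capture quoted above, a minimal sketch of the same steps as shell functions, plus a filter for the one device of interest. The function names capture_xfs_trace and filter_trace_dev and the TRACING variable are hypothetical helpers made up for illustration, not part of any XFS tooling. It assumes debugfs is mounted at /sys/kernel/debug, that you run it as root, and that each trace line carries a "dev MAJOR:MINOR" field as in the trace discussed here.

```shell
#!/bin/sh
# Sketch only: helper names and the TRACING variable are hypothetical.
# Assumes debugfs is mounted at /sys/kernel/debug and the caller is root.
TRACING=${TRACING:-/sys/kernel/debug/tracing}

# Record XFS trace events for a number of seconds and save them to a file.
# $1 = seconds to record (default 100), $2 = output file (default trace.out)
capture_xfs_trace() {
    secs=${1:-100}
    out=${2:-trace.out}
    echo 1 > "$TRACING/events/xfs/enable" || return 1
    sleep "$secs"
    cat "$TRACING/trace" > "$out"
    echo 0 > "$TRACING/events/xfs/enable"
}

# Keep only the trace lines for one device.
# $1 = device as MAJOR:MINOR (e.g. 8:17), $2 = trace file
# The trailing [^0-9] guard stops 8:17 from also matching 8:170.
filter_trace_dev() {
    grep -E "dev $1([^0-9]|\$)" "$2"
}
```

With those defined, something like `capture_xfs_trace 100 trace.out` followed by `filter_trace_dev 8:17 trace.out` would drop the lvm ('dev 254:.*') entries and leave only the events for the affected disk.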