On Fri, Jul 30, 2010 at 08:56:58AM +1000, Dave Chinner wrote:
> On Fri, Jul 30, 2010 at 12:05:46AM +1000, Nick Piggin wrote:
> > On Wed, Jul 28, 2010 at 11:17:44PM +1000, Dave Chinner wrote:
> > > Something very strange is happening, and to make matters worse I
> > > cannot reproduce it with a debug kernel (ran for 3 hours without
> > > failing). Hence it smells like a race condition somewhere.
> > >
> > > I've reproduced it without delayed logging, so it is not directly
> > > related to that functionality.
> > >
> > > I've seen this warning:
> > >
> > > Filesystem "ram0": inode 0x704680 background reclaim flush failed with 117
> > >
> > > Which indicates we failed to mark an inode stale when freeing an
> > > inode cluster, but I think I've fixed that and the problem still
> > > shows up. It's possible the last version didn't fix it, but....
> >
> > I've seen that one a couple of times too. Keeps coming back each
> > time you echo 3 > /proc/sys/vm/drop_caches :)
>
> Yup - it's an unflushable inode that is pinning the tail of the log,
> hence causing the log space hangs.
>
> > > Now I've got the ag iterator rotor patch in place as well and
> > > possibly a different version of the cluster free fix to what I
> > > previously tested and it's now been running for almost half an hour.
> > > I can't say yet whether I've fixed the bug or just changed the
> > > timing enough to avoid it. I'll leave this test running overnight
> > > and redo individual patch testing tomorrow.
> >
> > I reproduced it with fs_stress now too. Any patches I could test
> > for you just let me know.
>
> You should see them in a few minutes ;)

It's certainly not locking up like it used to... Thanks!