Hi Dave, Did the git bisect on the merge tree and end up up on this commit as the offensive one. ---------------------------- 3f16b9850743b702380f098ab5e0308cd6af1792 is the first bad commit commit 3f16b9850743b702380f098ab5e0308cd6af1792 Author: Dave Chinner <dchinner@xxxxxxxxxx> Date: Tue Dec 21 12:29:01 2010 +1100 xfs: introduce new locks for the log grant ticket wait queues The log grant ticket wait queues are currently protected by the log grant lock. However, the queues are functionally independent from each other, and operations on them only require serialisation against other queue operations now that all of the other log variables they use are atomic values. Hence, we can make them independent of the grant lock by introducing new locks just to protect the lists operations. because the lists are independent, we can use a lock per list and ensure that reserve and write head queuing do not contend. To ensure forced shutdowns work correctly in conjunction with the new fast paths, ensure that we check whether the log has been shut down in the grant functions once we hold the relevant spin locks but before we go to sleep. This is needed to co-ordinate correctly with the wakeups that are issued on the ticket queues so we don't leave any processes sleeping on the queues during a shutdown. Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> Reviewed-by: Christoph Hellwig <hch@xxxxxx> ------------------------------ Thanks chandra On Fri, 2011-02-25 at 13:06 +1100, Dave Chinner wrote: > On Thu, Feb 24, 2011 at 01:54:40PM -0800, Chandra Seetharaman wrote: > > Hello, > > > > I ran the xfs tests on my POWER box and was tripped by an ASSERT while > > running test 011. I have not seen it before in 2.6.37, so started with a > > git-bisect from 2.6.37 to 2.6.38-rc6 and ended at this commit: > > -------------------- > > Author Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> > > Tue, 11 Jan 2011 19:42:06 +0000 (11:42 -0800) > > committer Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> > > Tue, 11 Jan 2011 19:42:06 +0000 (11:42 -0800) > > commit 7bc4a4ce68f8c6d064ea949446852e996526f692 > > Merge branch 'for-linus-merged' of git://oss.sgi.com/xfs/xfs > > > > * 'for-linus-merged' of git://oss.sgi.com/xfs/xfs: (47 commits) > > xfs: convert grant head manipulations to lockless algorithm > > xfs: introduce new locks for the log grant ticket wait queues > > xfs: convert log grant heads to atomic variables > > xfs: convert l_tail_lsn to an atomic variable. > > xfs: convert l_last_sync_lsn to an atomic variable > > xfs: make AIL tail pushing independent of the grant lock > > xfs: use wait queues directly for the log wait queues > > xfs: combine grant heads into a single 64 bit integer > > xfs: rework log grant space calculations > > xfs: fact out common grant head/log tail verification code > > xfs: convert log grant ticket queues to list heads > > xfs: use AIL bulk delete function to implement single delete > > xfs: use AIL bulk update function to implement single updates > > xfs: remove all the inodes on a buffer from the AIL in bulk > > xfs: consume iodone callback items on buffers as they are processed > > xfs: reduce the number of AIL push wakeups > > xfs: bulk AIL insertion during transaction commit > > xfs: clean up xfs_ail_delete() > > xfs: Pull EFI/EFD handling out from under the AIL lock > > xfs: fix EFI transaction cancellation. > > ... > > ----------------------------- > > > > Since it included so many patches, I thought i will start a git-bisect > > on xfs git tree at git://oss.sgi.com/xfs/xfs, and ended up at > > 92f1c008ae79e32b83c0607d184b194f302bb3ee, which is the same commit as > > above. > > you landed on the the merge commit. mark that commit as bad, and the > bisect should continue down the merged branch. Otherwise you could > try skipping the merge commit and again the bisect should continue > down the branch. > > > Two questions: > > 1. Has anybody seen this problem ? I see this ASSERT has been added in > > that patch set, is it a false-positive ? > > The ASSERT() being triggered is racy. We are comparing the state of > two different atomic variables that are updated without > synchronisation. Hence after sampling the first, the second could be > changed before it is sampled and hence cause the assert to fail. > > I've never seen it be triggered, so I'm interested to know if this > is reproducable (i.e. a real problem) or whether it is a false > trigger (i.e. update race). > > Cheers, > > Dave. _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs