On Mon, Feb 04, 2013 at 10:13:11AM -0600, Alex Elder wrote:
> In xfs_ifunlock() there is a call to wake_up_bit() after clearing
> the flush lock on the xfs inode.  This is not guaranteed to be safe,
> as noted in the comments above wake_up_bit() beginning with:
>
>     In order for this to function properly, as it uses
>     waitqueue_active() internally, some kind of memory
>     barrier must be done prior to calling this.
>
> I claim no mastery of the details and subtlety of memory barrier
> use, but I believe the issue is that the call to waitqueue_active()
> in __wake_up_bit() could be operating on a value of "wq" that is
> out of date.  This patch fixes this by inserting a call to smp_mb()
> in xfs_ifunlock() before calling wake_up_bit(), along the lines of
> what's done in unlock_new_inode().  A little more explanation
> follows.
>
>
> In __xfs_iflock(), prepare_to_wait_exclusive() adds a wait queue
> entry to the end of a bit wait queue before setting the current task
> state to UNINTERRUPTIBLE.  And although setting the task state
> issues a full smp_mb() (which ensures changes made are visible to
> the rest of the system at that point) that alone does not guarantee
> that other CPUs will instantly avail themselves of the updated
> value.  A separate CPU needs to issue at least a read barrier in
> order to ensure the wq value it uses to determine whether there are
> waiters is up-to-date, and waitqueue_active() does not do that.

You can probably trim most of this and simply point at the comment
describing wake_up_bit()....

> I came to suspect this code because we had a customer with a system
> that was hung with one or more tasks stuck in __xfs_iflock().  A
> little poking around the affected code led me to the comments in
> wake_up_bit().
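
For anyone following along, the pairing described above can be sketched
in userspace C11. This is only an illustration under stated assumptions,
not the kernel code: struct fake_inode, FAKE_IFLOCK, fake_inode_init(),
waiter_prepare() and unlock_and_wake() are made-up names, a counter
stands in for the bit wait queue, and atomic_thread_fence() stands in
for smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define FAKE_IFLOCK 0x1u	/* hypothetical stand-in for XFS_IFLOCK */

struct fake_inode {
	atomic_uint flags;	/* stand-in for ip->i_flags */
	atomic_int waiters;	/* stand-in for a non-empty bit wait queue */
	atomic_int wakeups;	/* number of wakeups issued so far */
};

void fake_inode_init(struct fake_inode *ip)
{
	atomic_init(&ip->flags, FAKE_IFLOCK);	/* start with the lock held */
	atomic_init(&ip->waiters, 0);
	atomic_init(&ip->wakeups, 0);
}

/*
 * Waiter side: queue ourselves, issue a full barrier, then re-check
 * the bit.  Returns true if the caller would have to go to sleep.
 */
bool waiter_prepare(struct fake_inode *ip)
{
	atomic_fetch_add_explicit(&ip->waiters, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return (atomic_load_explicit(&ip->flags, memory_order_relaxed)
		& FAKE_IFLOCK) != 0;
}

/*
 * Waker side: clear the bit, issue a full barrier (the analogue of the
 * smp_mb() added by the patch), then look for waiters.  Returns true
 * if a wakeup was issued.
 */
bool unlock_and_wake(struct fake_inode *ip)
{
	atomic_fetch_and_explicit(&ip->flags, ~FAKE_IFLOCK,
				  memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&ip->waiters, memory_order_relaxed) == 0)
		return false;	/* analogue of !waitqueue_active() */
	atomic_fetch_add_explicit(&ip->wakeups, 1, memory_order_relaxed);
	return true;
}
```

With a full fence on both sides, two threads racing through
waiter_prepare() and unlock_and_wake() cannot end up with the waiter
seeing the bit still set while the waker sees no waiters. Drop the fence
on the waker side and that outcome becomes possible, which is the lost
wakeup the patch is meant to close.
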
>
> Signed-off-by: Alex Elder <elder@xxxxxxxxxxx>
> ---
>  fs/xfs/xfs_inode.h |    1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 22baf6e..237e7f6 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -419,6 +419,7 @@ static inline void xfs_iflock(struct xfs_inode *ip)
>  static inline void xfs_ifunlock(struct xfs_inode *ip)
>  {
>  	xfs_iflags_clear(ip, XFS_IFLOCK);
> +	smp_mb();
>  	wake_up_bit(&ip->i_flags, __XFS_IFLOCK_BIT);

ACK, smp_mb() is needed because spin_unlock() is not a full memory
barrier and so not everyone will have seen the bit being cleared.

Reviewed-by: Dave Chinner <david@xxxxxxxxxxxxx>

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx