Re: [PATCH] mm: add missing mutex lock arround notify_change

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 19, 2011 at 12:43:43PM +1100, Dave Chinner wrote:
> > We have a shitload of deadlocks on very common paths with that patch.  What
> > of the paths that do lead to file_remove_suid() without i_mutex?
> > *	xfs_file_aio_write_checks(): we drop i_mutex (via xfs_rw_iunlock())
> > just before calling file_remove_suid().  Racy, the fix is obvious - move
> > file_remove_suid() call before unlocking.
> 
> Not exactly. xfs_rw_iunlock() is not doing what you think it's doing
> there.....

Huh?  It is called as 

> > -	xfs_rw_iunlock(ip, XFS_ILOCK_EXCL);

and thus in
static inline void
xfs_rw_iunlock(
        struct xfs_inode        *ip,
        int                     type)
{
        xfs_iunlock(ip, type);
        if (type & XFS_IOLOCK_EXCL)
                mutex_unlock(&VFS_I(ip)->i_mutex);
}
we are guaranteed to hit i_mutex.  

> Wrong lock.  That's dropping the internal XFS inode metadata lock,
> but the VFS i_mutex is associated with the internal XFS inode IO
> lock, which is accessed via XFS_IOLOCK_*. Only if we take the iolock
> via XFS_IOLOCK_EXCL do we actually take the i_mutex.

> Now it gets complex. For buffered IO, we are guaranteed to already
> be holding the i_mutex because we do:
> 
>         *iolock = XFS_IOLOCK_EXCL;
>         xfs_rw_ilock(ip, *iolock);
> 
>         ret = xfs_file_aio_write_checks(file, &pos, &count, new_size, iolock);
> 
> So that is safe and non-racy right now.

No, it is not - we *drop* it before calling file_remove_suid().  Explicitly.
Again, look at that xfs_rw_iunlock() call there - it does drop i_mutex
(which is to say, you'd better have taken it prior to that, or you have
far worse problems).

> For direct IO, however, we don't always take the IOLOCK exclusively.
> Indeed, we try really, really hard not to do this so we can do
> concurrent reads and writes to the inode, and that results
> in a bunch of lock juggling when we actually need the IOLOCK
> exclusive (like in xfs_file_aio_write_checks()). It sounds like we
> need to know if we are going to have to remove the SUID bit ahead of
> time so that we can  take the correct lock up front. I haven't
> looked at what is needed to do that yet.

OK, I'm definitely missing something.  The very first thing
xfs_file_aio_write_checks() does is
        xfs_rw_ilock(ip, XFS_ILOCK_EXCL);
which really makes me wonder how the hell does that manage to avoid an
instant deadlock in case of call via xfs_file_buffered_aio_write()
where we have:
        struct address_space    *mapping = file->f_mapping;
        struct inode            *inode = mapping->host;
        struct xfs_inode        *ip = XFS_I(inode);
        *iolock = XFS_IOLOCK_EXCL;
        xfs_rw_ilock(ip, *iolock);
        ret = xfs_file_aio_write_checks(file, &pos, &count, new_size, iolock);
which leads to
        struct inode            *inode = file->f_mapping->host;
        struct xfs_inode        *ip = XFS_I(inode);
(IOW, inode and ip are the same as in the caller) followed by
        xfs_rw_ilock(ip, XFS_ILOCK_EXCL);
and with both xfs_rw_ilock() calls turning into
	mutex_lock(&VFS_I(ip)->i_mutex);
        xfs_ilock(ip, XFS_ILOCK_EXCL);
we ought to deadlock on that i_mutex.  What am I missing and how do we manage
to survive that?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]