Re: [PATCH] xfs: add missing ilock around dio write last extent alignment

Brian Foster <bfoster@xxxxxxxxxx> · Thu, 17 Sep 2015 08:16:06 -0400

On Thu, Sep 17, 2015 at 08:34:35AM +1000, Dave Chinner wrote:
> On Mon, Sep 14, 2015 at 09:24:55AM -0400, Brian Foster wrote:
> > On Mon, Sep 14, 2015 at 09:58:35AM +1000, Dave Chinner wrote:
> > > On Wed, Sep 09, 2015 at 10:43:32AM -0400, Brian Foster wrote:
...
> > > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> > > > index 1f86033..4d7534e 100644
> > > > --- a/fs/xfs/xfs_iomap.c
> > > > +++ b/fs/xfs/xfs_iomap.c
> > > > @@ -142,7 +142,9 @@ xfs_iomap_write_direct(
> > > >  	offset_fsb = XFS_B_TO_FSBT(mp, offset);
> > > >  	last_fsb = XFS_B_TO_FSB(mp, ((xfs_ufsize_t)(offset + count)));
> > > >  	if ((offset + count) > XFS_ISIZE(ip)) {
> > > > +		xfs_ilock(ip, XFS_ILOCK_EXCL);
> > > >  		error = xfs_iomap_eof_align_last_fsb(mp, ip, extsz, &last_fsb);
> > > > +		xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > > 
> > > XFS_ILOCK_SHARED?
> > > 
> > 
> > I suspect that is technically sufficient in this particular call path
> > given that we've called xfs_bmapi_read(). The problem is that there is a
> > call to xfs_iread_extents() buried a few calls deep in
> > xfs_bmap_last_extent().
> 
> Sure.
> 
> > My understanding is that we need the exclusive
> > lock because it's not safe for multiple threads to populate the in-core
> > extent list at the same time, so I don't really want to replace the
> > existing race with a landmine should the context happen to change in the
> > future.
> 
> yes, but that can't happen here because we are guaranteed to have
> the extent list in memory because we've alreay called
> xfs_bmapi_read() and that will populate the extent list with the
> appropriate lock held.
> 
> > > Also, looking at __xfs_get_blocks(), we drop the ilock immediately
> > > before calling xfs_iomap_write_direct(), which we already hold in
> > > shared mode for the xfs_bmapi_read() for direct IO.
> > > 
> > > Can we push that lock dropping into xfs_iomap_write_direct() after
> > > we've done the xfs_iomap_eof_align_last_fsb() call and before we do
> > > transaction reservations so we don't need an extra lock round-trip
> > > here? e.g. xfs_iomap_write_delay() is called under the lock context
> > > held by __xfs_get_blocks()....
> > > 
> > 
> > That was my initial thought when looking at this code... e.g., to just
> > carry the lock over and drop it prior to transaction setup. I didn't go
> > that route because __xfs_get_blocks() uses a variable locking mode and
> > it seemed ugly to pass along the lock mode to xfs_iomap_direct_write().
> > Further, given the above it also looked like we'd have to check and
> > cycle the ilock EXCL if it were ILOCK_SHARED. Finally,
> 
> No, because the __xfs_get_blocks code calls
> xfs_ilock_data_map_shared() for direct IO, so already holds the
> correct lock for populating the extent list (not that this matters
> here).
> 
> > xfs_iomap_direct_write() has a call to xfs_qm_dqattach() which itself
> > acquires ILOCK_EXCL. Looking at xfs_iomap_write_delay(), we do have a
> > dqattach_locked() variant but it also expects to have ILOCK_EXCL.
> 
> That can be moved to after we've calculated the last extent. i.e.
> to just before we start the transaction....
> 
> > The only thing I'm not sure about is the shared lock safe version of
> > xfs_iomap_eof_align_last_fsb().
> 
> All the callers are guaranteed to have first populated the extent
> list, so this should be safe. If you are really worried, add an
> assert that verifies either ILOCK_EXCL or (ILOCK_SHARED && extents
> read in)
> 
> > xfs_iread_extents() if it were called). Also, I take it we can safely
> > assume the in-core extent list is still around if we still hold the lock
> > from the xfs_bmapi_read() call. Thoughts? I guess I'll float another
> > patch...
> 
> Once the extents are read in, they are in memory until the inode is
> reclaimed. That won't happen while we have active references to it.
> :)
> 

Yeah, there's a v2 on the list that pretty much matches what is
suggested here. My main concern with the xfs_iread_extents() thing was
the landmine nature of it (after all, it was obfuscated enough that it
wasn't obvious locking was even necessary). I've addressed that with a
comment and asserts at the callsite. The only difference is I made it
such that xfs_iomap_write_direct() expects ILOCK_SHARED rather than take
a variable lockflag parameter, but we could do that either way. Thanks
for the comments.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs