Re: [PATCH] xfs: properly serialise fallocate against AIO+DIO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 29, 2019 at 03:41:33PM +1100, Dave Chinner wrote:
> On Mon, Oct 28, 2019 at 09:19:08PM -0700, Darrick J. Wong wrote:
> > On Tue, Oct 29, 2019 at 02:48:50PM +1100, Dave Chinner wrote:
...
> > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > > index 525b29b99116..865543e41fb4 100644
> > > --- a/fs/xfs/xfs_file.c
> > > +++ b/fs/xfs/xfs_file.c
> > > @@ -817,6 +817,36 @@ xfs_file_fallocate(
> > >  	if (error)
> > >  		goto out_unlock;
> > >  
...
> > > +	/*
> > > +	 * Now AIO and DIO has drained we flush and (if necessary) invalidate
> > > +	 * the cached range over the first operation we are about to run.
> > > +	 *
> > > +	 * We care about zero and collapse here because they both run a hole
> > > +	 * punch over the range first. Because that can zero data, and the range
> > > +	 * of invalidation for the shift operations is much larger, we still do
> > > +	 * the required flush for collapse in xfs_prepare_shift().
> > > +	 *
> > > +	 * Insert has the same range requirements as collapse, and we extend the
> > > +	 * file first which can zero data. Hence insert has the same
> > > +	 * flush/invalidate requirements as collapse and so they are both
> > > +	 * handled at the right time by xfs_prepare_shift().
> > > +	 */
> > > +	if (mode & (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_ZERO_RANGE |
> > > +		    FALLOC_FL_COLLAPSE_RANGE)) {
> > 
> > Er... "Insert has the same requirements as collapse", but we don't test
> > for that here?  Also ... xfs_prepare_shift handles flushing for both
> > collapse and insert range, but we still have to flush here for collapse?
> >
> > <confused but suspecting this has something to do with the fact that we
> > only do insert range after updating the isize?>
> 
> Yes, exactly.
> 
> The flush for collapse here is for the hole punch part of collapse,
> before we start shifting extents. insert does not hole punch, so it
> doesn't need flushing here but it still needs flush/inval before
> shifting. i.e.:
> 
> collapse				insert
> 
> flush_unmap(off, len)
> punch hole(off, len)			extends EOF
>   writes zeros around (off,len)		  writes zeros around EOF
> collapse(off, len)			insert(off, len)
>   flush_unmap(off, EOF)			  flush_unmap(off, EOF)
>   shift extents down			  shift extents up
> 
> So once we start the actual extent shift operation (up or down)
> the flush/unmap requirements are identical.
> 
> > I think the third paragraph of the comment is just confusing me more.
> > Does the following describe what's going on?
> > 
> > "Insert range has the same range [should this be "page cache flushing"?]
> > requirements as collapse.  Because we can zero data as part of extending
> > the file size, we skip the flush here and let the flush in
> > xfs_prepare_shift take care of invalidating the page cache." ?
> 
> It's a bit better - that's kinda what I was trying to describe - but
> I'll try to reword it more clearly after I've let it settle in my
> head for a little while....
> 

I agree the comment is a little confusing because to me, it's just
describing a bit too much for its context. I.e., I read the comment and
have to go look at other code to make sure I grok the comment rather
than the comment helping me grok the code it's associated with.

FWIW, I find something like the following a bit more clear/concise on
the whole:

        /*
+        * Once AIO and DIO has drained we flush and (if necessary) invalidate
+        * the cached range over the first operation we are about to run. We
+        * include zero and collapse here because they both start with a hole
+        * punch over the target range. Insert and collapse both invalidate the
+        * broader range affected by the shift in xfs_prepare_shift().
         */

... because it points out why we special case collapse here, and that
otherwise the prepare shift code is responsible for the rest. Just my
.02 and otherwise the patch looks good to me.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux