Re: [PATCH] fs: Avoid invalidation in interrupt context in dio_complete()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 17, 2017 at 11:53:56AM +0200, Jan Kara wrote:
> On Tue 17-10-17 11:11:24, Lukas Czerner wrote:
> > Currently we try to defer completion of async DIO to the process context
> > in case there are any mapped pages associated with the inode so that we
> > can invalidate the pages when the IO completes. However the check is racy
> > and the pages can be mapped afterwards. If this happens we might end up
> > calling invalidate_inode_pages2_range() in dio_complete() in interrupt
> > context which could sleep. This can be reproduced by generic/451.
> > 
> > Fix this by passing the information whether we can or can't invalidate
> > to the dio_complete(). Thanks Eryu Guan for reporting this and Jan Kara
> > for suggesting a fix.
> > 
> > Fixes: 332391a9935d ("fs: Fix page cache inconsistency when mixing buffered and AIO DIO")
> > Signed-off-by: Lukas Czerner <lczerner@xxxxxxxxxx>
> > Reported-by: Eryu Guan <eguan@xxxxxxxxxx>
> 
> Looks good to me. You can add:
> 
> Reviewed-by: Jan Kara <jack@xxxxxxx>
> 
> BTW, I guess you should CC someone who can merge the patch...

Jens, could you please take this through your tree ?

Thanks!
-Lukas

> 
> 								Honza
> 
> > ---
> >  fs/direct-io.c | 19 +++++++++++++------
> >  1 file changed, 13 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/direct-io.c b/fs/direct-io.c
> > index 96415c6..5632548 100644
> > --- a/fs/direct-io.c
> > +++ b/fs/direct-io.c
> > @@ -45,6 +45,12 @@
> >  #define DIO_PAGES	64
> >  
> >  /*
> > + * Flags for dio_complete()
> > + */
> > +#define DIO_COMPLETE_ASYNC		0x01	/* This is async IO */
> > +#define DIO_COMPLETE_INVALIDATE		0x02	/* Can invalidate pages */
> > +
> > +/*
> >   * This code generally works in units of "dio_blocks".  A dio_block is
> >   * somewhere between the hard sector size and the filesystem block size.  it
> >   * is determined on a per-invocation basis.   When talking to the filesystem
> > @@ -225,7 +231,7 @@ static inline struct page *dio_get_page(struct dio *dio,
> >   * filesystems can use it to hold additional state between get_block calls and
> >   * dio_complete.
> >   */
> > -static ssize_t dio_complete(struct dio *dio, ssize_t ret, bool is_async)
> > +static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
> >  {
> >  	loff_t offset = dio->iocb->ki_pos;
> >  	ssize_t transferred = 0;
> > @@ -266,7 +272,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, bool is_async)
> >  	 * one is a pretty crazy thing to do, so we don't support it 100%.  If
> >  	 * this invalidation fails, tough, the write still worked...
> >  	 */
> > -	if (ret > 0 && dio->op == REQ_OP_WRITE &&
> > +	if (flags & DIO_COMPLETE_INVALIDATE &&
> > +	    ret > 0 && dio->op == REQ_OP_WRITE &&
> >  	    dio->inode->i_mapping->nrpages) {
> >  		err = invalidate_inode_pages2_range(dio->inode->i_mapping,
> >  					offset >> PAGE_SHIFT,
> > @@ -285,7 +292,7 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, bool is_async)
> >  	if (!(dio->flags & DIO_SKIP_DIO_COUNT))
> >  		inode_dio_end(dio->inode);
> >  
> > -	if (is_async) {
> > +	if (flags & DIO_COMPLETE_ASYNC) {
> >  		/*
> >  		 * generic_write_sync expects ki_pos to have been updated
> >  		 * already, but the submission path only does this for
> > @@ -306,7 +313,7 @@ static void dio_aio_complete_work(struct work_struct *work)
> >  {
> >  	struct dio *dio = container_of(work, struct dio, complete_work);
> >  
> > -	dio_complete(dio, 0, true);
> > +	dio_complete(dio, 0, DIO_COMPLETE_ASYNC | DIO_COMPLETE_INVALIDATE);
> >  }
> >  
> >  static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio);
> > @@ -348,7 +355,7 @@ static void dio_bio_end_aio(struct bio *bio)
> >  			queue_work(dio->inode->i_sb->s_dio_done_wq,
> >  				   &dio->complete_work);
> >  		} else {
> > -			dio_complete(dio, 0, true);
> > +			dio_complete(dio, 0, DIO_COMPLETE_ASYNC);
> >  		}
> >  	}
> >  }
> > @@ -1360,7 +1367,7 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> >  		dio_await_completion(dio);
> >  
> >  	if (drop_refcount(dio) == 0) {
> > -		retval = dio_complete(dio, retval, false);
> > +		retval = dio_complete(dio, retval, DIO_COMPLETE_INVALIDATE);
> >  	} else
> >  		BUG_ON(retval != -EIOCBQUEUED);
> >  
> > -- 
> > 2.7.5
> > 
> -- 
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux