Re: [PATCH 7/7] nfs: page cache invalidation for dio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 14 Nov 2013 08:50:34 -0800
Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:

> Make sure to properly invalidate the pagecache before performing direct I/O,
> so that no stale pages are left around.  This matches what the generic
> direct I/O code does.  Also take the i_mutex over the direct write submission
> to avoid the lifelock vs truncate waiting for i_dio_count to decrease, and
> to avoid having the pagecache easily repopulated while direct I/O is in
> progrss.  Again matching the generic direct I/O code.
> 
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> ---
>  fs/nfs/direct.c |   29 +++++++++++++++++++++++++++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
> index 6cc7fe1..2b778fc 100644
> --- a/fs/nfs/direct.c
> +++ b/fs/nfs/direct.c
> @@ -939,9 +939,12 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, const struct iovec *iov,
>  	struct inode *inode = mapping->host;
>  	struct nfs_direct_req *dreq;
>  	struct nfs_lock_context *l_ctx;
> +	loff_t end;
>  	size_t count;
>  
>  	count = iov_length(iov, nr_segs);
> +	end = (pos + count - 1) >> PAGE_CACHE_SHIFT;
> +
>  	nfs_add_stats(mapping->host, NFSIOS_DIRECTWRITTENBYTES, count);
>  
>  	dfprintk(FILE, "NFS: direct write(%pD2, %zd@%Ld)\n",
> @@ -958,16 +961,25 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, const struct iovec *iov,
>  	if (!count)
>  		goto out;
>  
> +	mutex_lock(&inode->i_mutex);
> +
>  	result = nfs_sync_mapping(mapping);
>  	if (result)
> -		goto out;
> +		goto out_unlock;
> +
> +	if (mapping->nrpages) {
> +		result = invalidate_inode_pages2_range(mapping,
> +					pos >> PAGE_CACHE_SHIFT, end);
> +		if (result)
> +			goto out_unlock;
> +	}
>  
>  	task_io_account_write(count);
>  
>  	result = -ENOMEM;
>  	dreq = nfs_direct_req_alloc();
>  	if (!dreq)
> -		goto out;
> +		goto out_unlock;
>  
>  	dreq->inode = inode;
>  	dreq->bytes_left = count;
> @@ -982,6 +994,14 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, const struct iovec *iov,
>  		dreq->iocb = iocb;
>  
>  	result = nfs_direct_write_schedule_iovec(dreq, iov, nr_segs, pos, uio);
> +	
> +	if (mapping->nrpages) {
> +		invalidate_inode_pages2_range(mapping,
> +					      pos >> PAGE_CACHE_SHIFT, end);
> +	}
> +
> +	mutex_unlock(&inode->i_mutex);
> +
>  	if (!result) {
>  		result = nfs_direct_wait(dreq);
>  		if (result > 0) {
> @@ -994,8 +1014,13 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, const struct iovec *iov,
>  			spin_unlock(&inode->i_lock);
>  		}
>  	}
> +	nfs_direct_req_release(dreq);
> +	return result;
> +
>  out_release:
>  	nfs_direct_req_release(dreq);
> +out_unlock:
> +	mutex_unlock(&inode->i_mutex);
>  out:
>  	return result;
>  }

Hrm... I started chasing down a bug reported by our QA group last week
that's showing up when you mix DIO writes and buffered reads
(basically, diotest3 in the LTP suite is failing). The bug is marked
private for dumb reasons but I'll see if I can make it public. I'll
also plan to give this series a spin to see if it helps fix that bug...

In any case, the DIO write code calls nfs_zap_mapping after it gets the
WRITE reply. That sets NFS_INO_INVALID_DATA and should prevent buffered
read() calls from getting data out of the cache after the write reply
comes in.

Why is that not sufficient here?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux