Re: [PATCH] 9p: Fix DIO read through netfs

David Howells wrote on Fri, Aug 09, 2024 at 02:56:09PM +0100:
> From: Dominique Martinet <asmadeus@xxxxxxxxxxxxx>
> 
> 9p: Fix DIO read through netfs

nitpick: not sure how that ended up here, but this duplicates the
subject (the commit message ends up with this line twice)

> If a program is watching a file on a 9p mount, it won't see any change in
> size if the file being exported by the server is changed directly in the
> source filesystem, presumably because 9p doesn't have change notifications,
> and because netfs skips the reads if the file is empty.
> 
> Fix this by attempting to read the full size specified when a DIO read is
> requested (such as when 9p is operating in unbuffered mode) and dealing
> with a short read if the EOF was less than the expected read.
> 
> To make this work, filesystems using netfslib must not set
> NETFS_SREQ_CLEAR_TAIL if performing a DIO read where that read hit the EOF.
> I don't want to mandatorily clear this flag in netfslib for DIO because,
> say, ceph might make a read from an object that is not completely filled,
> but does not reside at the end of file - and so we need to clear the
> excess.
> 
> This can be tested by watching an empty file over 9p within a VM (such as
> in the ktest framework):
> 
>         while true; do read content; if [ -n "$content" ]; then echo $content; break; fi; done < /host/tmp/foo

(This is basically the same thing, but useful if one wants to control the
read timing for more precise/verbose debugging:
  exec 3< /host/tmp/foo
  read -u 3 content && echo $content
  (repeat as appropriate)
  exec 3>&-
)
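
For anyone without a 9p mount at hand, here is a minimal sketch of the same
fd-based check run against a local temporary file. The temp file is an
assumption standing in for /host/tmp/foo, and the variable names are only for
illustration. It shows the behaviour the watcher expects (a read at EOF
fails, and a later read through the same fd sees the new data); it does not
reproduce the 9p bug itself.

```shell
#!/bin/sh
# Local-filesystem sketch of the fd-based watcher. The temp file replaces
# /host/tmp/foo from the thread (assumption: no 9p mount is available).
tmpfile=$(mktemp)
exec 3< "$tmpfile"              # hold the still-empty file open on fd 3

# First read hits EOF straight away: read fails and nothing is returned.
if read -r content <&3; then first="data"; else first="eof"; fi

# Simulate the host-side change, then read again through the same fd.
echo hello >> "$tmpfile"
if read -r content <&3; then second="$content"; else second="eof"; fi

echo "first=$first second=$second"
exec 3<&-                       # close fd 3
rm -f "$tmpfile"
```

On a 9p mount in unbuffered mode, the second read is the one that should
start returning data once the fix is applied.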

> then writing something into the empty file.  The watcher should immediately
> display the file content and break out of the loop.  Without this fix, it
> remains in the loop indefinitely.
> 
> Fixes: 80105ed2fd27 ("9p: Use netfslib read/write_iter")
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218916
> Written-by: Dominique Martinet <asmadeus@xxxxxxxxxxxxx>

Thanks for adding extra comments & fixing other filesystems.

I've checked that this covers all cases of setting NETFS_SREQ_CLEAR_TAIL, so
it hopefully shouldn't have further side effects. This sounds good to me:

Signed-off-by: Dominique Martinet <asmadeus@xxxxxxxxxxxxx>

> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> cc: Eric Van Hensbergen <ericvh@xxxxxxxxxx>
> cc: Latchesar Ionkov <lucho@xxxxxxxxxx>
> cc: Christian Schoenebeck <linux_oss@xxxxxxxxxxxxx>
> cc: Marc Dionne <marc.dionne@xxxxxxxxxxxx>
> cc: Ilya Dryomov <idryomov@xxxxxxxxx>
> cc: Steve French <sfrench@xxxxxxxxx>
> cc: Paulo Alcantara <pc@xxxxxxxxxxxxx>
> cc: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> cc: v9fs@xxxxxxxxxxxxxxx
> cc: linux-afs@xxxxxxxxxxxxxxxxxxx
> cc: ceph-devel@xxxxxxxxxxxxxxx
> cc: linux-cifs@xxxxxxxxxxxxxxx
> cc: linux-nfs@xxxxxxxxxxxxxxx
> cc: netfs@xxxxxxxxxxxxxxx
> cc: linux-fsdevel@xxxxxxxxxxxxxxx
> ---
>  fs/9p/vfs_addr.c     |    3 ++-
>  fs/afs/file.c        |    3 ++-
>  fs/ceph/addr.c       |    6 ++++--
>  fs/netfs/io.c        |   17 +++++++++++------
>  fs/nfs/fscache.c     |    3 ++-
>  fs/smb/client/file.c |    3 ++-
>  6 files changed, 23 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
> index a97ceb105cd8..24fdc74caeba 100644
> --- a/fs/9p/vfs_addr.c
> +++ b/fs/9p/vfs_addr.c
> @@ -75,7 +75,8 @@ static void v9fs_issue_read(struct netfs_io_subrequest *subreq)
>  
>  	/* if we just extended the file size, any portion not in
>  	 * cache won't be on server and is zeroes */
> -	__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
> +	if (subreq->rreq->origin != NETFS_DIO_READ)
> +		__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
>  
>  	netfs_subreq_terminated(subreq, err ?: total, false);
>  }
> diff --git a/fs/afs/file.c b/fs/afs/file.c
> index c3f0c45ae9a9..ec1be0091fdb 100644
> --- a/fs/afs/file.c
> +++ b/fs/afs/file.c
> @@ -242,7 +242,8 @@ static void afs_fetch_data_notify(struct afs_operation *op)
>  
>  	req->error = error;
>  	if (subreq) {
> -		__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
> +		if (subreq->rreq->origin != NETFS_DIO_READ)
> +			__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
>  		netfs_subreq_terminated(subreq, error ?: req->actual_len, false);
>  		req->subreq = NULL;
>  	} else if (req->done) {
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index cc0a2240de98..c4744a02db75 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -246,7 +246,8 @@ static void finish_netfs_read(struct ceph_osd_request *req)
>  	if (err >= 0) {
>  		if (sparse && err > 0)
>  			err = ceph_sparse_ext_map_end(op);
> -		if (err < subreq->len)
> +		if (err < subreq->len &&
> +		    subreq->rreq->origin != NETFS_DIO_READ)
>  			__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
>  		if (IS_ENCRYPTED(inode) && err > 0) {
>  			err = ceph_fscrypt_decrypt_extents(inode,
> @@ -282,7 +283,8 @@ static bool ceph_netfs_issue_op_inline(struct netfs_io_subrequest *subreq)
>  	size_t len;
>  	int mode;
>  
> -	__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
> +	if (rreq->origin != NETFS_DIO_READ)
> +		__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
>  	__clear_bit(NETFS_SREQ_COPY_TO_CACHE, &subreq->flags);
>  
>  	if (subreq->start >= inode->i_size)
> diff --git a/fs/netfs/io.c b/fs/netfs/io.c
> index c179a1c73fa7..5367caf3fa28 100644
> --- a/fs/netfs/io.c
> +++ b/fs/netfs/io.c
> @@ -530,7 +530,8 @@ void netfs_subreq_terminated(struct netfs_io_subrequest *subreq,
>  
>  	if (transferred_or_error == 0) {
>  		if (__test_and_set_bit(NETFS_SREQ_NO_PROGRESS, &subreq->flags)) {
> -			subreq->error = -ENODATA;
> +			if (rreq->origin != NETFS_DIO_READ)
> +				subreq->error = -ENODATA;
>  			goto failed;
>  		}
>  	} else {
> @@ -601,9 +602,14 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq,
>  			}
>  			if (subreq->len > ictx->zero_point - subreq->start)
>  				subreq->len = ictx->zero_point - subreq->start;
> +
> +			/* We limit buffered reads to the EOF, but let the
> +			 * server deal with larger-than-EOF DIO/unbuffered
> +			 * reads.
> +			 */
> +			if (subreq->len > rreq->i_size - subreq->start)
> +				subreq->len = rreq->i_size - subreq->start;
>  		}
> -		if (subreq->len > rreq->i_size - subreq->start)
> -			subreq->len = rreq->i_size - subreq->start;
>  		if (rreq->rsize && subreq->len > rreq->rsize)
>  			subreq->len = rreq->rsize;
>  
> @@ -739,11 +745,10 @@ int netfs_begin_read(struct netfs_io_request *rreq, bool sync)
>  	do {
>  		_debug("submit %llx + %llx >= %llx",
>  		       rreq->start, rreq->submitted, rreq->i_size);
> -		if (rreq->origin == NETFS_DIO_READ &&
> -		    rreq->start + rreq->submitted >= rreq->i_size)
> -			break;
>  		if (!netfs_rreq_submit_slice(rreq, &io_iter))
>  			break;
> +		if (test_bit(NETFS_SREQ_NO_PROGRESS, &rreq->flags))
> +			break;
>  		if (test_bit(NETFS_RREQ_BLOCKED, &rreq->flags) &&
>  		    test_bit(NETFS_RREQ_NONBLOCK, &rreq->flags))
>  			break;
> diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> index bf29a65c5027..7a558dea75c4 100644
> --- a/fs/nfs/fscache.c
> +++ b/fs/nfs/fscache.c
> @@ -363,7 +363,8 @@ void nfs_netfs_read_completion(struct nfs_pgio_header *hdr)
>  		return;
>  
>  	sreq = netfs->sreq;
> -	if (test_bit(NFS_IOHDR_EOF, &hdr->flags))
> +	if (test_bit(NFS_IOHDR_EOF, &hdr->flags) &&
> +	    sreq->rreq->origin != NETFS_DIO_READ)
>  		__set_bit(NETFS_SREQ_CLEAR_TAIL, &sreq->flags);
>  
>  	if (hdr->error)
> diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
> index b2405dd4d4d4..3f3842e7b44a 100644
> --- a/fs/smb/client/file.c
> +++ b/fs/smb/client/file.c
> @@ -217,7 +217,8 @@ static void cifs_req_issue_read(struct netfs_io_subrequest *subreq)
>  			goto out;
>  	}
>  
> -	__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
> +	if (subreq->rreq->origin != NETFS_DIO_READ)
> +		__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
>  
>  	rc = rdata->server->ops->async_readv(rdata);
>  out:
> 

-- 
Dominique Martinet | Asmadeus



