Hi, David,

On Tue, Sep 24, 2024 at 10:47:30PM +0100, David Howells wrote:
> Does the attached fix the problem?

yes, as I've replied in
https://lore.kernel.org/all/ZvTD2t5s8lwQyphN@xsang-OptiPlex-9020/

thanks!

>
> David
> ---
> netfs: Fix write oops in generic/346 (9p) and maybe generic/074 (cifs)
>
> In netfslib, a buffered writeback operation has a 'write queue' of folios
> that are being written, held in a linear sequence of folio_queue structs.
> The 'issuer' adds new folio_queues on the leading edge of the queue and
> populates each one progressively; the 'collector' pops them off the
> trailing edge and discards them and the folios they point to as they are
> consumed.
>
> The queue is required to always retain at least one folio_queue structure.
> This allows the queue to be accessed without locking and with just a bit of
> barriering.
>
> When a new subrequest is prepared, its ->io_iter iterator is pointed at the
> current end of the write queue and then the iterator is extended as more
> data is added to the queue until the subrequest is committed.
>
> Now, the problem is that the folio_queue at the leading edge of the write
> queue when a subrequest is prepared might have been entirely consumed - but
> not yet removed from the queue as it is the only remaining one and is
> preventing the queue from collapsing.
>
> So, what happens is that subreq->io_iter is pointed at the spent
> folio_queue, then a new folio_queue is added, and, at that point, the
> collector is entirely at liberty to immediately delete the spent
> folio_queue.
>
> This leaves the subreq->io_iter pointing at a freed object.  If the system
> is lucky, iterate_folioq() sees ->io_iter, sees the as-yet uncorrupted
> freed object and advances to the next folio_queue in the queue.
>
> In the case seen, however, the freed object gets recycled and put back onto
> the queue at the tail and filled to the end.  This confuses
> iterate_folioq() and it tries to step ->next, which may be NULL - resulting
> in an oops.
>
> Fix this by the following means:
>
>  (1) When preparing a write subrequest, make sure there's a folio_queue
>      struct with space in it at the leading edge of the queue.  A function
>      to make space is split out of the function to append a folio so that
>      it can be called for this purpose.
>
>  (2) If the request struct iterator is pointing to a completely spent
>      folio_queue when we make space, then advance the iterator to the newly
>      allocated folio_queue.  The subrequest's iterator will then be set
>      from this.
>
> Whilst we're at it, also split out the function to allocate a folio_queue,
> initialise it and do the accounting.
>
> The oops could be triggered using the generic/346 xfstest with a filesystem
> on 9P over TCP with cache=loose.  The oops looked something like:
>
>  BUG: kernel NULL pointer dereference, address: 0000000000000008
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  ...
>  RIP: 0010:_copy_from_iter+0x2db/0x530
>  ...
>  Call Trace:
>   <TASK>
>  ...
>   p9pdu_vwritef+0x3d8/0x5d0
>   p9_client_prepare_req+0xa8/0x140
>   p9_client_rpc+0x81/0x280
>   p9_client_write+0xcf/0x1c0
>   v9fs_issue_write+0x87/0xc0
>   netfs_advance_write+0xa0/0xb0
>   netfs_write_folio.isra.0+0x42d/0x500
>   netfs_writepages+0x15a/0x1f0
>   do_writepages+0xd1/0x220
>   filemap_fdatawrite_wbc+0x5c/0x80
>   v9fs_mmap_vm_close+0x7d/0xb0
>   remove_vma+0x35/0x70
>   vms_complete_munmap_vmas+0x11a/0x170
>   do_vmi_align_munmap+0x17d/0x1c0
>   do_vmi_munmap+0x13e/0x150
>   __vm_munmap+0x92/0xd0
>   __x64_sys_munmap+0x17/0x20
>   do_syscall_64+0x80/0xe0
>   entry_SYSCALL_64_after_hwframe+0x71/0x79
>
> This may also fix a similar-looking issue with cifs and generic/074.
>
> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> | Closes: https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@xxxxxxxxx
>
> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> cc: Eric Van Hensbergen <ericvh@xxxxxxxxxx>
> cc: Latchesar Ionkov <lucho@xxxxxxxxxx>
> cc: Dominique Martinet <asmadeus@xxxxxxxxxxxxx>
> cc: Christian Schoenebeck <linux_oss@xxxxxxxxxxxxx>
> cc: Steve French <sfrench@xxxxxxxxx>
> cc: Paulo Alcantara <pc@xxxxxxxxxxxxx>
> cc: Jeff Layton <jlayton@xxxxxxxxxx>
> cc: v9fs@xxxxxxxxxxxxxxx
> cc: linux-cifs@xxxxxxxxxxxxxxx
> cc: netfs@xxxxxxxxxxxxxxx
> cc: linux-fsdevel@xxxxxxxxxxxxxxx
> ---
>  fs/netfs/internal.h    |    2 +
>  fs/netfs/misc.c        |   72 ++++++++++++++++++++++++++++++++++---------------
>  fs/netfs/objects.c     |   12 ++++++++
>  fs/netfs/write_issue.c |   12 +++++++-
>  4 files changed, 76 insertions(+), 22 deletions(-)
>
> diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
> index c7f23dd3556a..79c0ad89affb 100644
> --- a/fs/netfs/internal.h
> +++ b/fs/netfs/internal.h
> @@ -58,6 +58,7 @@ static inline void netfs_proc_del_rreq(struct netfs_io_request *rreq) {}
>  /*
>   * misc.c
>   */
> +struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq);
>  int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
>  			      bool needs_put);
>  struct folio_queue *netfs_delete_buffer_head(struct netfs_io_request *wreq);
> @@ -76,6 +77,7 @@ void netfs_clear_subrequests(struct netfs_io_request *rreq, bool was_async);
>  void netfs_put_request(struct netfs_io_request *rreq, bool was_async,
>  		       enum netfs_rreq_ref_trace what);
>  struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq);
> +struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp);
>
>  static inline void netfs_see_request(struct netfs_io_request *rreq,
>  				     enum netfs_rreq_ref_trace what)
> diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c
> index 0ad0982ce0e2..a743e8963247 100644
> --- a/fs/netfs/misc.c
> +++ b/fs/netfs/misc.c
> @@ -9,34 +9,64 @@
>  #include "internal.h"
>
>  /*
> - * Append a folio to the rolling queue.
> + * Make sure there's space in the rolling queue.
>   */
> -int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
> -			      bool needs_put)
> +struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq)
>  {
> -	struct folio_queue *tail = rreq->buffer_tail;
> -	unsigned int slot, order = folio_order(folio);
> +	struct folio_queue *tail = rreq->buffer_tail, *prev;
> +	unsigned int prev_nr_slots = 0;
>
>  	if (WARN_ON_ONCE(!rreq->buffer && tail) ||
>  	    WARN_ON_ONCE(rreq->buffer && !tail))
> -		return -EIO;
> -
> -	if (!tail || folioq_full(tail)) {
> -		tail = kmalloc(sizeof(*tail), GFP_NOFS);
> -		if (!tail)
> -			return -ENOMEM;
> -		netfs_stat(&netfs_n_folioq);
> -		folioq_init(tail);
> -		tail->prev = rreq->buffer_tail;
> -		if (tail->prev)
> -			tail->prev->next = tail;
> -		rreq->buffer_tail = tail;
> -		if (!rreq->buffer) {
> -			rreq->buffer = tail;
> -			iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
> +		return ERR_PTR(-EIO);
> +
> +	prev = tail;
> +	if (prev) {
> +		if (!folioq_full(tail))
> +			return tail;
> +		prev_nr_slots = folioq_nr_slots(tail);
> +	}
> +
> +	tail = netfs_folioq_alloc(rreq, GFP_NOFS);
> +	if (!tail)
> +		return ERR_PTR(-ENOMEM);
> +	tail->prev = prev;
> +	if (prev)
> +		/* [!] NOTE: After we set prev->next, the consumer is entirely
> +		 * at liberty to delete prev.
> +		 */
> +		WRITE_ONCE(prev->next, tail);
> +
> +	rreq->buffer_tail = tail;
> +	if (!rreq->buffer) {
> +		rreq->buffer = tail;
> +		iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
> +	} else {
> +		/* Make sure we don't leave the master iterator pointing to a
> +		 * block that might get immediately consumed.
> +		 */
> +		if (rreq->io_iter.folioq == prev &&
> +		    rreq->io_iter.folioq_slot == prev_nr_slots) {
> +			rreq->io_iter.folioq = tail;
> +			rreq->io_iter.folioq_slot = 0;
>  		}
> -		rreq->buffer_tail_slot = 0;
>  	}
> +	rreq->buffer_tail_slot = 0;
> +	return tail;
> +}
> +
> +/*
> + * Append a folio to the rolling queue.
> + */
> +int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
> +			      bool needs_put)
> +{
> +	struct folio_queue *tail;
> +	unsigned int slot, order = folio_order(folio);
> +
> +	tail = netfs_buffer_make_space(rreq);
> +	if (IS_ERR(tail))
> +		return PTR_ERR(tail);
>
>  	rreq->io_iter.count += PAGE_SIZE << order;
>
> diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
> index d32964e8ca5d..dd8241bc996b 100644
> --- a/fs/netfs/objects.c
> +++ b/fs/netfs/objects.c
> @@ -250,3 +250,15 @@ void netfs_put_subrequest(struct netfs_io_subrequest *subreq, bool was_async,
>  	if (dead)
>  		netfs_free_subrequest(subreq, was_async);
>  }
> +
> +struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp)
> +{
> +	struct folio_queue *fq;
> +
> +	fq = kmalloc(sizeof(*fq), gfp);
> +	if (fq) {
> +		netfs_stat(&netfs_n_folioq);
> +		folioq_init(fq);
> +	}
> +	return fq;
> +}
> diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
> index 04e66d587f77..0929d9fd4ce7 100644
> --- a/fs/netfs/write_issue.c
> +++ b/fs/netfs/write_issue.c
> @@ -153,12 +153,22 @@ static void netfs_prepare_write(struct netfs_io_request *wreq,
>  				loff_t start)
>  {
>  	struct netfs_io_subrequest *subreq;
> +	struct iov_iter *wreq_iter = &wreq->io_iter;
> +
> +	/* Make sure we don't point the iterator at a used-up folio_queue
> +	 * struct being used as a placeholder to prevent the queue from
> +	 * collapsing.  In such a case, extend the queue.
> +	 */
> +	if (iov_iter_is_folioq(wreq_iter) &&
> +	    wreq_iter->folioq_slot >= folioq_nr_slots(wreq_iter->folioq)) {
> +		netfs_buffer_make_space(wreq);
> +	}
>
>  	subreq = netfs_alloc_subrequest(wreq);
>  	subreq->source = stream->source;
>  	subreq->start = start;
>  	subreq->stream_nr = stream->stream_nr;
> -	subreq->io_iter = wreq->io_iter;
> +	subreq->io_iter = *wreq_iter;
>
>  	_enter("R=%x[%x]", wreq->debug_id, subreq->debug_index);
>
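For anyone trying to picture the race described in the changelog above, here is a
minimal userspace sketch of the idea behind fixes (1) and (2).  This is not the
kernel code: `seg`, `seg_iter`, `SEG_SLOTS` and `make_space()` are invented names,
and the model only shows the issuer side (ensure the tail segment has a free slot,
and move any iterator off a fully-spent segment before the collector may free it).

	/* Toy model of the write queue: fixed-size segments chained like
	 * folio_queue structs.  Purely illustrative.
	 */
	#include <stdlib.h>

	#define SEG_SLOTS 4

	struct seg {
		struct seg *next;
		struct seg *prev;
		int nr_used;		/* slots filled by the issuer */
	};

	struct seg_iter {
		struct seg *seg;	/* segment the iterator points at */
		int slot;		/* next slot to use in that segment */
	};

	/* Ensure the tail segment has a free slot.  If the current tail is
	 * full, link a new segment and, before the collector can free the
	 * spent one, advance any iterator still parked on it - mirroring
	 * steps (1) and (2) of the fix above.
	 */
	static struct seg *make_space(struct seg **tail, struct seg_iter *it)
	{
		struct seg *prev = *tail, *seg;

		if (prev && prev->nr_used < SEG_SLOTS)
			return prev;		/* still room in the tail */

		seg = calloc(1, sizeof(*seg));
		if (!seg)
			return NULL;
		seg->prev = prev;
		if (prev)
			prev->next = seg;	/* collector may free prev from here on */
		*tail = seg;

		/* Don't leave the iterator on the spent segment. */
		if (it->seg == prev && it->slot >= SEG_SLOTS) {
			it->seg = seg;
			it->slot = 0;
		}
		return seg;
	}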