Re: [PATCH 1/2] sunrpc/cache: remove races with queuing an upcall.

On 05 Mar 2013 16:07:39 +0100 Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
wrote:

> Hi,
> 
> the patch in my previous mail was a bit too optimized,
> in that it obtains a cache_request* pointer from each entry
> on detail->queue, even if the entry really is a cache_reader.
> 
> As the resulting pointer is only used as a cache_request*
> after checking cache_queue->reader, this isn't a bug, but
> it is unclean.
> 
> Thus I changed the patch below once again to keep the sane
> handling as originally implemented.
> 
> Bodo
> 
> 
> On 04 Mar 2013 07:10:00 +0100 NeilBrown <neilb@xxxxxxx> wrote:
> 
> > We currently queue an upcall after setting CACHE_PENDING,
> > and dequeue after clearing CACHE_PENDING.
> > So a request should only be present when CACHE_PENDING is set.
> > 
> > However we don't combine the test and the enqueue/dequeue in
> > a protected region, so it is possible (if unlikely) for a race
> > to result in a request being queued without CACHE_PENDING set,
> > or a request to be absent despite CACHE_PENDING.
> > 
> > So: include a test for CACHE_PENDING inside the regions of
> > enqueue and dequeue where queue_lock is held, and abort
> > the operation if the value is not as expected.
> > 
> > Also remove the 'return' from cache_dequeue() to ensure it always
> > removes all entries:  As there is no locking between setting
> > CACHE_PENDING and calling sunrpc_cache_pipe_upcall, it is not
> > inconceivable for some other thread to clear CACHE_PENDING, and
> > then for someone else to set it and call sunrpc_cache_pipe_upcall,
> > both before the original thread completed its call.
> > 
> > With this, it is perfectly safe and correct to:
> >  - call cache_dequeue() if and only if we have just
> >    cleared CACHE_PENDING
> >  - call sunrpc_cache_pipe_upcall() (via cache_make_upcall)
> >    if and only if we have just set CACHE_PENDING.
> > 
> > Reported-by: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
> > Signed-off-by: NeilBrown <neilb@xxxxxxx>
> > ---
> >  net/sunrpc/cache.c |   17 ++++++++++++++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
> > 
> > diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> > index 9afa439..0400a92 100644
> > --- a/net/sunrpc/cache.c
> > +++ b/net/sunrpc/cache.c
> > @@ -1022,6 +1022,9 @@ static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
> >  			struct cache_request *cr = container_of(cq, struct cache_request, q);
> >  			if (cr->item != ch)
> >  				continue;
> > +			if (test_bit(CACHE_PENDING, &ch->flags))
> > +				/* Lost a race and it is pending again */
> > +				break;
> >  			if (cr->readers != 0)
> >  				continue;
> >  			list_del(&cr->q.list);
> > @@ -1029,7 +1032,6 @@ static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
> >  			cache_put(cr->item, detail);
> >  			kfree(cr->buf);
> >  			kfree(cr);
> > -			return;
> 
> Just removing "return" is not enough. If the loop no longer stops
> after dequeueing the first entry found, some other changes are
> also necessary:
> 
> 1) use list_for_each_entry_safe() instead of list_for_each_entry()
> 
> 2) don't call spin_unlock() in the loop.
> 
> Thus, at the end of this mail I added a revised patch. With this
> patch cache_dequeue() no longer frees the requests in the loop,
> but moves them to a temporary queue.
> After the loop it calls spin_unlock() and does the kfree() and
> cache_put() in a second loop.
> 
> The patch is not tested on a mainline kernel. Instead, I
> ported it to SLES11 SP1 2.6.32.59-0.7.1. On SLES 11 the test
> is still running fine.
> 
> Best Regards,
> Bodo
> 
> >  		}
> >  	spin_unlock(&queue_lock);
> >  }
> > @@ -1151,6 +1153,7 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
> >  	struct cache_request *crq;
> >  	char *bp;
> >  	int len;
> > +	int ret = 0;
> >  
> >  	if (!cache_listeners_exist(detail)) {
> >  		warn_no_listener(detail);
> > @@ -1182,10 +1185,18 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
> >  	crq->len = PAGE_SIZE - len;
> >  	crq->readers = 0;
> >  	spin_lock(&queue_lock);
> > -	list_add_tail(&crq->q.list, &detail->queue);
> > +	if (test_bit(CACHE_PENDING, &h->flags))
> > +		list_add_tail(&crq->q.list, &detail->queue);
> > +	else
> > +		/* Lost a race, no longer PENDING, so don't enqueue */
> > +		ret = -EAGAIN;
> >  	spin_unlock(&queue_lock);
> >  	wake_up(&queue_wait);
> > -	return 0;
> > +	if (ret == -EAGAIN) {
> > +		kfree(buf);
> > +		kfree(crq);
> > +	}
> > +	return ret;
> >  }
> >  EXPORT_SYMBOL_GPL(sunrpc_cache_pipe_upcall);
> >  
> > 
> 
> 
> .....
> 
> Reported-by: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
> Signed-off-by: NeilBrown <neilb@xxxxxxx>
> Signed-off-by: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
> ---
> 
> --- a/net/sunrpc/cache.c	2013-02-20 14:05:08.000000000 +0100
> +++ b/net/sunrpc/cache.c	2013-03-05 13:30:13.000000000 +0100
> @@ -1015,23 +1015,33 @@
>  
>  static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
>  {
> -	struct cache_queue *cq;
> +	struct cache_queue *cq, *tmp;
> +	struct cache_request *cr;
> +	struct list_head dequeued;
> +
> +	INIT_LIST_HEAD(&dequeued);
>  	spin_lock(&queue_lock);
> -	list_for_each_entry(cq, &detail->queue, list)
> +	list_for_each_entry_safe(cq, tmp, &detail->queue, list)
>  		if (!cq->reader) {
> -			struct cache_request *cr = container_of(cq, struct cache_request, q);
> +			cr = container_of(cq, struct cache_request, q);
>  			if (cr->item != ch)
>  				continue;
> +			if (test_bit(CACHE_PENDING, &ch->flags))
> +				/* Lost a race and it is pending again */
> +				break;
>  			if (cr->readers != 0)
>  				continue;
> -			list_del(&cr->q.list);
> -			spin_unlock(&queue_lock);
> -			cache_put(cr->item, detail);
> -			kfree(cr->buf);
> -			kfree(cr);
> -			return;
> +			list_move(&cr->q.list, &dequeued);
>  		}
>  	spin_unlock(&queue_lock);
> +
> +	while (!list_empty(&dequeued)) {
> +		cr = list_entry(dequeued.next, struct cache_request, q.list);
> +		list_del(&cr->q.list);
> +		cache_put(cr->item, detail);
> +		kfree(cr->buf);
> +		kfree(cr);
> +	}
>  }
>  
>  /*
> @@ -1151,6 +1161,7 @@
>  	struct cache_request *crq;
>  	char *bp;
>  	int len;
> +	int ret = 0;
>  
>  	if (!cache_listeners_exist(detail)) {
>  		warn_no_listener(detail);
> @@ -1182,10 +1193,18 @@
>  	crq->len = PAGE_SIZE - len;
>  	crq->readers = 0;
>  	spin_lock(&queue_lock);
> -	list_add_tail(&crq->q.list, &detail->queue);
> +	if (test_bit(CACHE_PENDING, &h->flags))
> +		list_add_tail(&crq->q.list, &detail->queue);
> +	else
> +		/* Lost a race, no longer PENDING, so don't enqueue */
> +		ret = -EAGAIN;
>  	spin_unlock(&queue_lock);
>  	wake_up(&queue_wait);
> -	return 0;
> +	if (ret == -EAGAIN) {
> +		kfree(buf);
> +		kfree(crq);
> +	}
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(sunrpc_cache_pipe_upcall);
>  


Thanks for this - that was careless of me.  Sorry.

Your patch looks good.  I'll forward the two to Bruce again shortly, then
look at submitting a backport for SLES.

Thanks,
NeilBrown

