On 26 Feb 2013 07:37:00 +0100 NeilBrown <neilb@xxxxxxx> wrote: > We currently queue an upcall after setting CACHE_PENDING, and dequeue after clearing CACHE_PENDING. > So a request should only be present when CACHE_PENDING is set. > > However we don't combine the test and the enqueue/dequeue in a protected region, so it is possible (if unlikely) for a race to result in a request being queued without CACHE_PENDING set, or a request to be absent despite CACHE_PENDING. > > So: include a test for CACHE_PENDING inside the regions of enqueue and dequeue where queue_lock is held, and abort the operation if the value is not as expected. > > With this, it perfectly safe and correct to: > - call cache_dequeue() if and only if we have just > cleared CACHE_PENDING > - call sunrpc_cache_pipe_upcall() (via cache_make_upcall) > if and only if we have just set CACHE_PENDING. > > Reported-by: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx> > Signed-off-by: NeilBrown <neilb@xxxxxxx> Sorry, I don't agree with this patch, as it fixes the first scenario of my mail from 24 Feb 2013, but AFAICS changes the second one (which has been a minor point that didn't need fixing necessarily) to a memory leak. I'll try to expain my doubts: Again, assume two threads calling cache_check() concurrently for the same cache entry of a cache that has a reader. Both threads get result -EAGAIN from cache_is_valid(). The second thread at that moment is interrupted and suspended for a while. The first thread sets CACHE_PENDING and queues an upcall request and sleeps waiting for the reply. The reader reads the request and replies accordingly. In sunrpc_cache_update() the reader changes the entry to CACHE_VALID and calls cache_fresh_unlocked(). In cache_fresh_unlocked() it resets CACHE_PENDING. At this moment it is interrupted and suspended. Now the second thread wakes up, sets CACHE_PENDING again and queues a new upcall request. The reader wakes up and sees, that CACHE_PENDING is set again and does not dequeue the old request. --> memory leak (?) In my opinion, there are two possible fixes that could replace this patch: 1) Don't call cache_dequeue() from cache_check(). Trying to dequeue something even if we know, that we haven't queued, looks strange to me. (And yes, I understand the reason, why you don't like it, but nevertheless I consider this the best solution.) This one would fix my first scenariop only, but not the second. 2) I think, the starting point of all trouble is in cache_check(). Currently, if a thread has set CACHE_PENDING, is works using a possibly no longer valid state of the cache entry (rv). AFAICS, it would fix most of the problems to re-check the cache entry's state between setting CACHE_PENDING and the upcall. The upcall should be done only, if still necessary. This one could be combined with a new bit in the entry's state, that is set if a valid entry is updated (that is: replaced). Checking this bit also immediately before cache_make_upcall() is called would also fix my second scenario fully in that it avoids unnecessary upcalls. Bodo > --- > net/sunrpc/cache.c | 16 ++++++++++++++-- > 1 file changed, 14 insertions(+), 2 deletions(-) > > diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c index 9afa439..b48c8ef 100644 > --- a/net/sunrpc/cache.c > +++ b/net/sunrpc/cache.c > @@ -1022,6 +1022,9 @@ static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch) > struct cache_request *cr = container_of(cq, struct cache_request, q); > if (cr->item != ch) > continue; > + if (test_bit(CACHE_PENDING, &ch->flags)) > + /* Lost a race and it is pending again */ > + break; > if (cr->readers != 0) > continue; > list_del(&cr->q.list); > @@ -1151,6 +1154,7 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h, > struct cache_request *crq; > char *bp; > int len; > + int ret = 0; > > if (!cache_listeners_exist(detail)) { > warn_no_listener(detail); > @@ -1182,10 +1186,18 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h, > crq->len = PAGE_SIZE - len; > crq->readers = 0; > spin_lock(&queue_lock); > - list_add_tail(&crq->q.list, &detail->queue); > + if (test_bit(CACHE_PENDING, &h->flags)) > + list_add_tail(&crq->q.list, &detail->queue); > + else > + /* Lost a race, no longer PENDING, so don't enqueue */ > + ret = -EAGAIN; > spin_unlock(&queue_lock); > wake_up(&queue_wait); > - return 0; > + if (ret == -EAGAIN) { > + kfree(buf); > + kfree(crq); > + } > + return ret; > } > EXPORT_SYMBOL_GPL(sunrpc_cache_pipe_upcall); > > ÿôèº{.nÇ+?·?®??+%?Ëÿ±éݶ¥?wÿº{.nÇ+?·¥?{±þwìþ)í?æèw*jg¬±¨¶????Ý¢jÿ¾«þG«?éÿ¢¸¢·¦j:+v?¨?wèjØm¶?ÿþø¯ù®w¥þ?àþf£¢·h??â?úÿ?Ù¥