[PATCH] sunrpc.ko: RPC cache fix races

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Second attempt using the correct FROM. Sorry for the noise.


Hi,

I found a problem in sunrpc.ko on a SLES11 SP1 (2.6.32.59-0,7.1)
and fixed it (see first patch ifor 2.6.32.60 below).
For us the patch works fine (tested on 2.6.32.59-0.7.1).

AFAICS from the code, the problem seems to persist in current
kernel versions also. Thus, I added the second patch for 3.7.9.
As the setup to reproduce the problem is quite complex, I couldn't
test the second patch yet. So consider this one as a RFC.

Best regards,
Bodo

Please CC me, I'm not on the list.

=========================================
From: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
Date: Fri, 08 Feb 2013
Subject: [PATCH] net: sunrpc: fix races in RPC cache

We found the problem and tested the patch on a SLES11 SP1 2.6.32.59-0.7.1

This patch applies to linux-2.6.32.60 (changed monotonic_seconds -->
get_seconds())

Sporadically NFS3 RPC requests to the nfs server are dropped due to
cache_check() (net/sunrpc/cache.c) returning -ETIMEDOUT for an entry
of the "auth.unix.gid" cache.
In this case, no NFS reply is sent to the client.

The reason for the dropped requests are races in cache_check() when
cache_make_upcall() returns -EINVAL (because it is called for a cache
without readers) and cache_check() therefore refreshes the cache entry 
(rv == -EAGAIN).

There are three details that need to be changed:
 1) cache_revisit_request() must not be called before cache_fresh_locked()
    has updated the cache entry, as cache_revisit_request() wakes up
    threads waiting for the cache entry to be updated.
    The explicit call to cache_revisit_request() is not needed, as
    cache_fresh_unlocked() calls it anyway.
    (But in case of not updating the cache entry, cache_revisit_request()
    must be called. Thus, we use a fall through in the "case").
 2) CACHE_PENDING must not be cleared before cache_fresh_locked() has
    updated the cache entry, as cache_defer_req() can return without really
    sleeping if it detects CACHE_PENDING unset.
    (In case of not updating the cache entry again we use the fall through)
 3) Imagine a thread that calls cache_check() and gets rv = -ENOENT from
    cache_is_valid(). Then it sets CACHE_PENDINGs and calls
    cache_make_upcall().
    We assume that meanwhile get_seconds() advances to the next
    sec. and a second thread also calls cache_check(). It gets -EAGAIN from
    cache_is_valid() for the same cache entry. As CACHE_PENDING still is
    set, it calls cache_defer_req() immediately and waits for a wakeup from
    the first thread.
    After cache_make_upcall() returned -EINVAL, the first thread does not
    update the cache entry as it had got rv = -ENOENT, but wakes up the
    second thread by calling cache_revisit_request().
    Thread two wakes up, calls cache_is_valid() and again gets -EAGAIN.
    Thus, the result of the second cache_check() is -ETIMEDOUT and the
    NFS request is dropped.
    To solve this, the cache entry now is updated not only if rv == -EAGAIN
    but if rv == -ENOENT also. This is a sufficient workaround, as the
    first thread would have to stay in cache_check() between its call to
    cache_is_valid() and clearing CACHE_PENDING for more than 60 seconds
    to break the workaround.
    
Signed-off-by: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
---

--- a/net/sunrpc/cache.c	2012-08-08 21:35:09.000000000 +0200
+++ b/net/sunrpc/cache.c	2013-02-08 14:29:41.000000000 +0100
@@ -233,15 +233,14 @@ int cache_check(struct cache_detail *det
 		if (!test_and_set_bit(CACHE_PENDING, &h->flags)) {
 			switch (cache_make_upcall(detail, h)) {
 			case -EINVAL:
-				clear_bit(CACHE_PENDING, &h->flags);
-				cache_revisit_request(h);
-				if (rv == -EAGAIN) {
+				if (rv == -EAGAIN || rv == -ENOENT) {
 					set_bit(CACHE_NEGATIVE, &h->flags);
 					cache_fresh_locked(h, get_seconds()+CACHE_NEW_EXPIRY);
+					clear_bit(CACHE_PENDING, &h->flags);
 					cache_fresh_unlocked(h, detail);
 					rv = -ENOENT;
+					break;
 				}
-				break;
 
 			case -EAGAIN:
 				clear_bit(CACHE_PENDING, &h->flags);
=========================================
From: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
Date: Fri, 08 Feb 2013
Subject: [PATCH] net: sunrpc: fix races in RPC cache

This patch applies to SLES 11 SP2 linux-3.0.51-0.7.9 and also to
vanilla linux-3.7.9

It is untested and is only based on a code review after we
analyzed the reason for NFS requests being dropped on a
SLES11 SP1 (linux-2.6.32.59-0.7.1)

Sporadically NFS3 RPC requests to the nfs server are dropped due to
cache_check() (net/sunrpc/cache.c) returning -ETIMEDOUT for an entry
of the "auth.unix.gid" cache.
In this case, no NFS reply is sent to the client.

The reason for the dropped requests are races in cache_check() when
cache_make_upcall() returns -EINVAL (because it is called for a cache
without readers) and cache_check() therefore refreshes the cache entry 
(rv == -EAGAIN).

There are two details that need to be changed:
 1) cache_revisit_request() must not be called before cache_fresh_locked()
    has updated the cache entry, as cache_revisit_request() wakes up
    threads waiting for the cache entry to be updated.
    The explicit call to cache_revisit_request() is not needed, as
    cache_fresh_unlocked() calls it anyway.
    (But in case of not updating the cache entry, cache_revisit_request()
    must be called).
 2) CACHE_PENDING must not be cleared before cache_fresh_locked() has
    updated the cache entry, as cache_wait_req() called by
    cache_defer_req() can return without really sleeping if it detects
    CACHE_PENDING unset.

Signed-off-by: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
---

--- a/net/sunrpc/cache.c	2013-02-08 15:56:07.000000000 +0100
+++ b/net/sunrpc/cache.c	2013-02-08 16:04:32.000000000 +0100
@@ -230,11 +230,14 @@ static int try_to_negate_entry(struct ca
 	rv = cache_is_valid(detail, h);
 	if (rv != -EAGAIN) {
 		write_unlock(&detail->hash_lock);
+		clear_bit(CACHE_PENDING, &h->flags);
+		cache_revisit_request(h);
 		return rv;
 	}
 	set_bit(CACHE_NEGATIVE, &h->flags);
 	cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY);
 	write_unlock(&detail->hash_lock);
+	clear_bit(CACHE_PENDING, &h->flags);
 	cache_fresh_unlocked(h, detail);
 	return -ENOENT;
 }
@@ -275,8 +278,6 @@ int cache_check(struct cache_detail *det
 		if (!test_and_set_bit(CACHE_PENDING, &h->flags)) {
 			switch (cache_make_upcall(detail, h)) {
 			case -EINVAL:
-				clear_bit(CACHE_PENDING, &h->flags);
-				cache_revisit_request(h);
 				rv = try_to_negate_entry(detail, h);
 				break;
 			case -EAGAIN:
ÿôèº{.nÇ+?·?®?­?+%?Ëÿ±éݶ¥?wÿº{.nÇ+?·¥?{±þwìþ)í?æèw*jg¬±¨¶????Ý¢jÿ¾«þG«?éÿ¢¸¢·¦j:+v?¨?wèjØm¶?ÿþø¯ù®w¥þ?àþf£¢·h??â?úÿ?Ù¥



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux