Second attempt using the correct FROM. Sorry for the noise. Hi, I found a problem in sunrpc.ko on a SLES11 SP1 (2.6.32.59-0,7.1) and fixed it (see first patch ifor 2.6.32.60 below). For us the patch works fine (tested on 2.6.32.59-0.7.1). AFAICS from the code, the problem seems to persist in current kernel versions also. Thus, I added the second patch for 3.7.9. As the setup to reproduce the problem is quite complex, I couldn't test the second patch yet. So consider this one as a RFC. Best regards, Bodo Please CC me, I'm not on the list. ========================================= From: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx> Date: Fri, 08 Feb 2013 Subject: [PATCH] net: sunrpc: fix races in RPC cache We found the problem and tested the patch on a SLES11 SP1 2.6.32.59-0.7.1 This patch applies to linux-2.6.32.60 (changed monotonic_seconds --> get_seconds()) Sporadically NFS3 RPC requests to the nfs server are dropped due to cache_check() (net/sunrpc/cache.c) returning -ETIMEDOUT for an entry of the "auth.unix.gid" cache. In this case, no NFS reply is sent to the client. The reason for the dropped requests are races in cache_check() when cache_make_upcall() returns -EINVAL (because it is called for a cache without readers) and cache_check() therefore refreshes the cache entry (rv == -EAGAIN). There are three details that need to be changed: 1) cache_revisit_request() must not be called before cache_fresh_locked() has updated the cache entry, as cache_revisit_request() wakes up threads waiting for the cache entry to be updated. The explicit call to cache_revisit_request() is not needed, as cache_fresh_unlocked() calls it anyway. (But in case of not updating the cache entry, cache_revisit_request() must be called. Thus, we use a fall through in the "case"). 2) CACHE_PENDING must not be cleared before cache_fresh_locked() has updated the cache entry, as cache_defer_req() can return without really sleeping if it detects CACHE_PENDING unset. (In case of not updating the cache entry again we use the fall through) 3) Imagine a thread that calls cache_check() and gets rv = -ENOENT from cache_is_valid(). Then it sets CACHE_PENDINGs and calls cache_make_upcall(). We assume that meanwhile get_seconds() advances to the next sec. and a second thread also calls cache_check(). It gets -EAGAIN from cache_is_valid() for the same cache entry. As CACHE_PENDING still is set, it calls cache_defer_req() immediately and waits for a wakeup from the first thread. After cache_make_upcall() returned -EINVAL, the first thread does not update the cache entry as it had got rv = -ENOENT, but wakes up the second thread by calling cache_revisit_request(). Thread two wakes up, calls cache_is_valid() and again gets -EAGAIN. Thus, the result of the second cache_check() is -ETIMEDOUT and the NFS request is dropped. To solve this, the cache entry now is updated not only if rv == -EAGAIN but if rv == -ENOENT also. This is a sufficient workaround, as the first thread would have to stay in cache_check() between its call to cache_is_valid() and clearing CACHE_PENDING for more than 60 seconds to break the workaround. Signed-off-by: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx> --- --- a/net/sunrpc/cache.c 2012-08-08 21:35:09.000000000 +0200 +++ b/net/sunrpc/cache.c 2013-02-08 14:29:41.000000000 +0100 @@ -233,15 +233,14 @@ int cache_check(struct cache_detail *det if (!test_and_set_bit(CACHE_PENDING, &h->flags)) { switch (cache_make_upcall(detail, h)) { case -EINVAL: - clear_bit(CACHE_PENDING, &h->flags); - cache_revisit_request(h); - if (rv == -EAGAIN) { + if (rv == -EAGAIN || rv == -ENOENT) { set_bit(CACHE_NEGATIVE, &h->flags); cache_fresh_locked(h, get_seconds()+CACHE_NEW_EXPIRY); + clear_bit(CACHE_PENDING, &h->flags); cache_fresh_unlocked(h, detail); rv = -ENOENT; + break; } - break; case -EAGAIN: clear_bit(CACHE_PENDING, &h->flags); ========================================= From: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx> Date: Fri, 08 Feb 2013 Subject: [PATCH] net: sunrpc: fix races in RPC cache This patch applies to SLES 11 SP2 linux-3.0.51-0.7.9 and also to vanilla linux-3.7.9 It is untested and is only based on a code review after we analyzed the reason for NFS requests being dropped on a SLES11 SP1 (linux-2.6.32.59-0.7.1) Sporadically NFS3 RPC requests to the nfs server are dropped due to cache_check() (net/sunrpc/cache.c) returning -ETIMEDOUT for an entry of the "auth.unix.gid" cache. In this case, no NFS reply is sent to the client. The reason for the dropped requests are races in cache_check() when cache_make_upcall() returns -EINVAL (because it is called for a cache without readers) and cache_check() therefore refreshes the cache entry (rv == -EAGAIN). There are two details that need to be changed: 1) cache_revisit_request() must not be called before cache_fresh_locked() has updated the cache entry, as cache_revisit_request() wakes up threads waiting for the cache entry to be updated. The explicit call to cache_revisit_request() is not needed, as cache_fresh_unlocked() calls it anyway. (But in case of not updating the cache entry, cache_revisit_request() must be called). 2) CACHE_PENDING must not be cleared before cache_fresh_locked() has updated the cache entry, as cache_wait_req() called by cache_defer_req() can return without really sleeping if it detects CACHE_PENDING unset. Signed-off-by: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx> --- --- a/net/sunrpc/cache.c 2013-02-08 15:56:07.000000000 +0100 +++ b/net/sunrpc/cache.c 2013-02-08 16:04:32.000000000 +0100 @@ -230,11 +230,14 @@ static int try_to_negate_entry(struct ca rv = cache_is_valid(detail, h); if (rv != -EAGAIN) { write_unlock(&detail->hash_lock); + clear_bit(CACHE_PENDING, &h->flags); + cache_revisit_request(h); return rv; } set_bit(CACHE_NEGATIVE, &h->flags); cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY); write_unlock(&detail->hash_lock); + clear_bit(CACHE_PENDING, &h->flags); cache_fresh_unlocked(h, detail); return -ENOENT; } @@ -275,8 +278,6 @@ int cache_check(struct cache_detail *det if (!test_and_set_bit(CACHE_PENDING, &h->flags)) { switch (cache_make_upcall(detail, h)) { case -EINVAL: - clear_bit(CACHE_PENDING, &h->flags); - cache_revisit_request(h); rv = try_to_negate_entry(detail, h); break; case -EAGAIN: ÿôèº{.nÇ+?·?®??+%?Ëÿ±éݶ¥?wÿº{.nÇ+?·¥?{±þwìþ)í?æèw*jg¬±¨¶????Ý¢jÿ¾«þG«?éÿ¢¸¢·¦j:+v?¨?wèjØm¶?ÿþø¯ù®w¥þ?àþf£¢·h??â?úÿ?Ù¥