On Thu, Nov 29 2018, NeilBrown wrote: > On Wed, Nov 28 2018, J. Bruce Fields wrote: > >> On Wed, Nov 28, 2018 at 11:45:46AM +0300, Vasily Averin wrote: >>> Dear all, we have found memory leak on OpenVz7 node and believe it >>> affects mainline too. >>> >>> sunrpc_cache_lookup() removes exprired cache_head from hash, however >>> if it waits for reply on submitted cache_request both of them can leak >>> forever, nobody cleans unhashed cache_heads. >>> >>> Originally we had claim on busy loop device of stopped container, that >>> had executed nfs server inside. Device was kept by mount that was >>> detached from already destroyed mount namespace. By using crash >>> search we have found some structure with path struct related to our >>> mount. Finally we have found that it was alive svc_export struct used >>> by to alive cache_request, however both of them pointed to already >>> freed cache_detail. >>> >>> We decided that cache_detail was correctly freed during destroy of net >>> namespace, however svc_export with taken path struct, cache_request >>> and some other structures seems was leaked forever. >>> >>> This could happen only if cache_head of svc_export was removed from >>> hash on cache_detail before its destroy. Finally we have found that it >>> could happen when sunrpc_cache_lookup() removes expired cache_head >>> from hash. >>> >>> Usually it works correctly and cache_put(freeme) frees expired >>> cache_head. However in our case cache_head have an extra reference >>> counter from stalled cache_request. Becasue of cache_head was removed >>> from hash of cache_detail it cannot be found in cache_clean() and its >>> cache_request cannot be freed in cache_dequeue(). Memory leaks >>> forever, exactly like we observed. >>> >>> After may attempts we have reproduced this situation on OpenVz7 >>> kernel, however our reproducer is quite long and complex. >>> Unfortunately we still did not reproduced this problem on mainline >>> kernel and did not validated the patch yet. >>> >>> It would be great if someone advised us some simple way to trigger >>> described scenario. >> >> I think you should be able to produce hung upcalls by flushing the cache >> (exportfs -f), then stopping mountd, then trying to access the >> filesystem from a client. Does that help? >> >>> We are not sure that our patch is correct, please let us know if our >>> analyze missed something. >> >> It looks OK to me, but it would be helpful to have Neil's review too. > > Yes, it makes sense to me. > Reviewed-by: NeilBrown <neilb@xxxxxxxx> Unfortunately I was wrong. See below. NeilBrown From: NeilBrown <neilb@xxxxxxxx> Subject: [PATCH] sunrpc: don't mark uninitialised items as VALID. A recent commit added a call to cache_fresh_locked() when an expired item was found. The call sets the CACHE_VALID flag, so it is important that the item actually is valid. There are two ways it could be valid: 1/ If ->update has been called to fill in relevant content 2/ If CACHE_NEGATIVE is set, to say that content doesn't exist. An expired item that is waiting for an update will be neither. Setting CACHE_VALID will mean that a subsequent call to cache_put() will be likely to dereference uninitialised pointers. So we must make sure the item is valid, and we already have code to do that in try_to_negate_entry(). This takes the hash lock and so cannot be used directly, so take out the two lines that we need and use them. Now cache_fresh_locked() is certain to be called only on a valid item. Cc: stable@xxxxxxxxxx # 2.6.35 Fixes: 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued request") Signed-off-by: NeilBrown <neilb@xxxxxxxx> --- net/sunrpc/cache.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c index 12bb23b8e0c5..f5f8ce8c3443 100644 --- a/net/sunrpc/cache.c +++ b/net/sunrpc/cache.c @@ -105,6 +105,8 @@ static struct cache_head *sunrpc_cache_add_entry(struct cache_detail *detail, if (cache_is_expired(detail, tmp)) { hlist_del_init_rcu(&tmp->cache_list); detail->entries --; + if (cache_is_valid(tmp) == -EAGAIN) + set_bit(CACHE_NEGATIVE, &h->flags); cache_fresh_locked(tmp, 0, detail); freeme = tmp; break; -- 2.14.0.rc0.dirty
Attachment:
signature.asc
Description: PGP signature