On Fri, 2012-05-18 at 16:16 -0400, Jeff Layton wrote: > xprt_alloc_slot will call rpc_delay() to make the task wait a bit before > retrying when it gets back an -ENOMEM error from xprt_dynamic_alloc_slot. > > The problem there is that rpc_delay does this: > > rpc_sleep_on(&delay_queue, task, __rpc_atrun); > > ...and __rpc_atrun will reset the task->tk_status back to 0 when the > task wakes back up. This makes call_reserveresult abort the task with > -EIO. > > The RH QA group discovered this problem when testing with low-memory > clients and this patch fixed it: > > https://bugzilla.redhat.com/show_bug.cgi?id=822189 > > Fix this by adding a new rpc_delay_action() variant that makes allows > us to pass in a rpc_action to reset the status back to -EAGAIN instead. Hi Jeff, I'm not sure that I like the idea of an rpc_delay variant that fails to reset the tk_status. How about something like the following solution instead? Cheers Trond 8<--------------------------------------------------------------------- >From 1afeaf5c29aa07db25760d2fbed5c08a3aec3498 Mon Sep 17 00:00:00 2001 From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Date: Sat, 19 May 2012 12:12:53 -0400 Subject: [PATCH] sunrpc: fix loss of task->tk_status after rpc_delay call in xprt_alloc_slot xprt_alloc_slot will call rpc_delay() to make the task wait a bit before retrying when it gets back an -ENOMEM error from xprt_dynamic_alloc_slot. The problem is that rpc_delay will clear the task->tk_status, causing call_reserveresult to abort the task. The solution is simply to let call_reserveresult handle the ENOMEM error directly. Reported-by: Jeff Layton <jlayton@xxxxxxxxxx> Cc: stable@xxxxxxxxxxxxxxx [>= 3.1] Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> --- net/sunrpc/clnt.c | 2 ++ net/sunrpc/xprt.c | 5 +++-- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index adf2990..25302c8 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -1288,6 +1288,8 @@ call_reserveresult(struct rpc_task *task) } switch (status) { + case -ENOMEM: + rpc_delay(task, HZ >> 2); case -EAGAIN: /* woken up; retry */ task->tk_action = call_reserve; return; diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index b239e75..d7ccd79 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -984,15 +984,16 @@ static void xprt_alloc_slot(struct rpc_task *task) goto out_init_req; switch (PTR_ERR(req)) { case -ENOMEM: - rpc_delay(task, HZ >> 2); dprintk("RPC: dynamic allocation of request slot " "failed! Retrying\n"); + task->tk_status = -ENOMEM; break; case -EAGAIN: rpc_sleep_on(&xprt->backlog, task, NULL); dprintk("RPC: waiting for request slot\n"); + default: + task->tk_status = -EAGAIN; } - task->tk_status = -EAGAIN; return; out_init_req: task->tk_status = 0; -- 1.7.7.6 -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@xxxxxxxxxx www.netapp.com ��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥