On Mon, 2013-01-07 at 15:29 -0500, Trond Myklebust wrote: +AD4- On Mon, 2013-01-07 at 15:20 -0500, Chris Perl wrote: +AD4- +AD4- On Mon, Jan 07, 2013 at 07:50:10PM +-0000, Myklebust, Trond wrote: +AD4- +AD4- +AD4- Hi Chris, +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- Excellent sleuthing+ACE- Given the thoroughness of your explanation, I'm +AD4- +AD4- +AD4- pretty sure that the attached patch should fix the problem. +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- Cheers +AD4- +AD4- +AD4- Trond +AD4- +AD4- +AD4- -- +AD4- +AD4- +AD4- Trond Myklebust +AD4- +AD4- +AD4- Linux NFS client maintainer +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- NetApp +AD4- +AD4- +AD4- Trond.Myklebust+AEA-netapp.com +AD4- +AD4- +AD4- www.netapp.com +AD4- +AD4- +AD4- +AD4- +AD4- From ec8cbb4aff21cd0eac2c6f3fc4273ac72cdd91ef Mon Sep 17 00:00:00 2001 +AD4- +AD4- +AD4- From: Trond Myklebust +ADw-Trond.Myklebust+AEA-netapp.com+AD4- +AD4- +AD4- +AD4- Date: Mon, 7 Jan 2013 14:30:46 -0500 +AD4- +AD4- +AD4- Subject: +AFs-PATCH+AF0- SUNRPC: Ensure we release the socket write lock if the +AD4- +AD4- +AD4- rpc+AF8-task exits early +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- If the rpc+AF8-task exits while holding the socket write lock before it has +AD4- +AD4- +AD4- allocated an rpc slot, then the usual mechanism for releasing the write +AD4- +AD4- +AD4- lock in xprt+AF8-release() is defeated. +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- The problem occurs if the call to xprt+AF8-lock+AF8-write() initially fails, so +AD4- +AD4- +AD4- that the rpc+AF8-task is put on the xprt-+AD4-sending wait queue. If the task +AD4- +AD4- +AD4- exits after being assigned the lock by +AF8AXw-xprt+AF8-lock+AF8-write+AF8-func, but +AD4- +AD4- +AD4- before it has retried the call to xprt+AF8-lock+AF8-and+AF8-alloc+AF8-slot(), then +AD4- +AD4- +AD4- it calls xprt+AF8-release() while holding the write lock, but will +AD4- +AD4- +AD4- immediately exit due to the test for task-+AD4-tk+AF8-rqstp +ACEAPQ- NULL. +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- Reported-by: Chris Perl +ADw-chris.perl+AEA-gmail.com+AD4- +AD4- +AD4- +AD4- Signed-off-by: Trond Myklebust +ADw-Trond.Myklebust+AEA-netapp.com+AD4- +AD4- +AD4- +AD4- Cc: stable+AEA-vger.kernel.org +AFsAPgA9- 3.1+AF0- +AD4- +AD4- +AD4- --- +AD4- +AD4- +AD4- net/sunrpc/xprt.c +AHw- 6 +-+-+-+--- +AD4- +AD4- +AD4- 1 file changed, 4 insertions(+-), 2 deletions(-) +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c +AD4- +AD4- +AD4- index bd462a5..6676457 100644 +AD4- +AD4- +AD4- --- a/net/sunrpc/xprt.c +AD4- +AD4- +AD4- +-+-+- b/net/sunrpc/xprt.c +AD4- +AD4- +AD4- +AEAAQA- -1136,10 +-1136,12 +AEAAQA- static void xprt+AF8-request+AF8-init(struct rpc+AF8-task +ACo-task, struct rpc+AF8-xprt +ACo-xprt) +AD4- +AD4- +AD4- void xprt+AF8-release(struct rpc+AF8-task +ACo-task) +AD4- +AD4- +AD4- +AHs- +AD4- +AD4- +AD4- struct rpc+AF8-xprt +ACo-xprt+ADs- +AD4- +AD4- +AD4- - struct rpc+AF8-rqst +ACo-req+ADs- +AD4- +AD4- +AD4- +- struct rpc+AF8-rqst +ACo-req +AD0- task-+AD4-tk+AF8-rqstp+ADs- +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- - if (+ACE-(req +AD0- task-+AD4-tk+AF8-rqstp)) +AD4- +AD4- +AD4- +- if (req +AD0APQ- NULL) +AHs- +AD4- +AD4- +AD4- +- xprt+AF8-release+AF8-write(task-+AD4-tk+AF8-xprt, task)+ADs- +AD4- +AD4- +AD4- return+ADs- +AD4- +AD4- +AD4- +- +AH0- +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- xprt +AD0- req-+AD4-rq+AF8-xprt+ADs- +AD4- +AD4- +AD4- if (task-+AD4-tk+AF8-ops-+AD4-rpc+AF8-count+AF8-stats +ACEAPQ- NULL) +AD4- +AD4- +AD4- -- +AD4- +AD4- +AD4- 1.7.11.7 +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- +AD4- Ah, I totally missed the call to +AGA-rpc+AF8-release+AF8-task' at the bottom of the +AD4- +AD4- +AGAAXwBf-rpc+AF8-execute' loop (at least thats how I think we'd get to this function +AD4- +AD4- you're patching). +AD4- +AD4- +AD4- +AD4- But wouldn't we need to update the call site in +AD4- +AD4- +AGA-rpc+AF8-release+AF8-resources+AF8-task' as well? It contains an explicit check for +AD4- +AD4- +AGA-task-+AD4-tk+AF8-rqstp' being non null. +AD4- +AD4- Ewww.... You're right: That's a wart that seems to have been copied and +AD4- pasted quite a few times. +AD4- +AD4- Here is v2... ...and a v3 that adds a small optimisation to avoid taking the transport lock in cases where we really don't need it. -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust+AEA-netapp.com www.netapp.com
From 51b63a538c54cb9c3b83c4d62572cf18da165cba Mon Sep 17 00:00:00 2001 From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Date: Mon, 7 Jan 2013 14:30:46 -0500 Subject: [PATCH v3] SUNRPC: Ensure we release the socket write lock if the rpc_task exits early If the rpc_task exits while holding the socket write lock before it has allocated an rpc slot, then the usual mechanism for releasing the write lock in xprt_release() is defeated. The problem occurs if the call to xprt_lock_write() initially fails, so that the rpc_task is put on the xprt->sending wait queue. If the task exits after being assigned the lock by __xprt_lock_write_func, but before it has retried the call to xprt_lock_and_alloc_slot(), then it calls xprt_release() while holding the write lock, but will immediately exit due to the test for task->tk_rqstp != NULL. Reported-by: Chris Perl <chris.perl@xxxxxxxxx> Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Cc: stable@xxxxxxxxxxxxxxx [>= 3.1] --- net/sunrpc/sched.c | 3 +-- net/sunrpc/xprt.c | 8 ++++++-- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c index b4133bd..bfa3171 100644 --- a/net/sunrpc/sched.c +++ b/net/sunrpc/sched.c @@ -972,8 +972,7 @@ static void rpc_async_release(struct work_struct *work) static void rpc_release_resources_task(struct rpc_task *task) { - if (task->tk_rqstp) - xprt_release(task); + xprt_release(task); if (task->tk_msg.rpc_cred) { put_rpccred(task->tk_msg.rpc_cred); task->tk_msg.rpc_cred = NULL; diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index bd462a5..6acc0c5 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -1136,10 +1136,14 @@ static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt) void xprt_release(struct rpc_task *task) { struct rpc_xprt *xprt; - struct rpc_rqst *req; + struct rpc_rqst *req = task->tk_rqstp; - if (!(req = task->tk_rqstp)) + if (req == NULL) { + xprt = task->tk_xprt; + if (xprt->snd_task == task) + xprt_release_write(xprt, task); return; + } xprt = req->rq_xprt; if (task->tk_ops->rpc_count_stats != NULL) -- 1.7.11.7