On 2022/5/22 9:41 AM, Chuck Lever III wrote:
On May 21, 2022, at 9:24 PM, Kinglong Mee <kinglongmee@xxxxxxxxx> wrote:
Hi Chuck,
On 2022/5/22 12:33 AM, Chuck Lever III wrote:
On May 21, 2022, at 5:51 AM, Kinglong Mee <kinglongmee@xxxxxxxxx> wrote:
When rdma server returns a fault reply, rpcrdma may treats it as a bcall.
As using NFSv3, a bc server is never exist.
rpcrdma_bc_receive_call will meets NULL pointer as,
[ 226.057890] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
...
[ 226.058704] RIP: 0010:_raw_spin_lock+0xc/0x20
...
[ 226.059732] Call Trace:
[ 226.059878] rpcrdma_bc_receive_call+0x138/0x327 [rpcrdma]
[ 226.060011] __ib_process_cq+0x89/0x170 [ib_core]
[ 226.060092] ib_cq_poll_work+0x26/0x80 [ib_core]
[ 226.060257] process_one_work+0x1a7/0x360
[ 226.060367] ? create_worker+0x1a0/0x1a0
[ 226.060440] worker_thread+0x30/0x390
[ 226.060500] ? create_worker+0x1a0/0x1a0
[ 226.060574] kthread+0x116/0x130
[ 226.060661] ? kthread_flush_work_fn+0x10/0x10
[ 226.060724] ret_from_fork+0x35/0x40
...
Signed-off-by: Kinglong Mee <kinglongmee@xxxxxxxxx>
---
net/sunrpc/xprtrdma/rpc_rdma.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 281ddb87ac8d..9486bb98eb2f 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -1121,9 +1121,14 @@ static bool
rpcrdma_is_bcall(struct rpcrdma_xprt *r_xprt, struct rpcrdma_rep *rep)
#if defined(CONFIG_SUNRPC_BACKCHANNEL)
{
+ struct rpc_xprt *xprt = &r_xprt->rx_xprt;
struct xdr_stream *xdr = &rep->rr_stream;
__be32 *p;
+ /* no bc service, not a bcall. */
+ if (xprt->bc_serv == NULL)
+ return false;
+
if (rep->rr_proc != rdma_msg)
return false;
I'm not sure what you mean above by "fault reply".
The check here for whether the RPC/RDMA procedure is an RDMA_MSG
is supposed to be enough to avoid any further processing of an
RDMA_ERR type procedure.
What kind of fault has occurred? Can you share with us the
actual RPC/RDMA transport header that triggers the BUG?
I have a nfs rdma server, but it handles drc reply wrongly sometimes,
it returns a bad format reply to nfs client.
Nfs rdma client treats the bad format reply as a bcall, and
Doesn't this test return false:
1144 if (*p != cpu_to_be32(RPC_CALL))
1145 return false;
No, it doesn't return false here.
The debug message at rpcrdma_bc_receive_call are,
[56579.837169] RPC: rpcrdma_bc_receive_call: callback XID
00000001, length=20
[56579.837174] RPC: rpcrdma_bc_receive_call: 00 00 00 01 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 04
But OK: a malicious NFSv3 server can trigger a client crash.
That's a bug.
This is an additional conditional in a hot path. Would it make
sense to move the new test to just after 1145? >
Even then, it could be a bcall, the client simply isn't
prepared to process it. So "/* no bc service */" by itself
would be a more accurate comment.
Okay, I will update it and send a v2 patch after moving the checking
after 1145.
Thanks,
Kinglong Mee