> On May 21, 2022, at 9:24 PM, Kinglong Mee <kinglongmee@xxxxxxxxx> wrote: > > Hi Chuck, > > On 2022/5/22 12:33 AM, Chuck Lever III wrote: >>> On May 21, 2022, at 5:51 AM, Kinglong Mee <kinglongmee@xxxxxxxxx> wrote: >>> >>> When rdma server returns a fault reply, rpcrdma may treats it as a bcall. >>> As using NFSv3, a bc server is never exist. >>> rpcrdma_bc_receive_call will meets NULL pointer as, >>> >>> [ 226.057890] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8 >>> ... >>> [ 226.058704] RIP: 0010:_raw_spin_lock+0xc/0x20 >>> ... >>> [ 226.059732] Call Trace: >>> [ 226.059878] rpcrdma_bc_receive_call+0x138/0x327 [rpcrdma] >>> [ 226.060011] __ib_process_cq+0x89/0x170 [ib_core] >>> [ 226.060092] ib_cq_poll_work+0x26/0x80 [ib_core] >>> [ 226.060257] process_one_work+0x1a7/0x360 >>> [ 226.060367] ? create_worker+0x1a0/0x1a0 >>> [ 226.060440] worker_thread+0x30/0x390 >>> [ 226.060500] ? create_worker+0x1a0/0x1a0 >>> [ 226.060574] kthread+0x116/0x130 >>> [ 226.060661] ? kthread_flush_work_fn+0x10/0x10 >>> [ 226.060724] ret_from_fork+0x35/0x40 >>> ... >>> >>> Signed-off-by: Kinglong Mee <kinglongmee@xxxxxxxxx> >>> --- >>> net/sunrpc/xprtrdma/rpc_rdma.c | 5 +++++ >>> 1 file changed, 5 insertions(+) >>> >>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c >>> index 281ddb87ac8d..9486bb98eb2f 100644 >>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c >>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c >>> @@ -1121,9 +1121,14 @@ static bool >>> rpcrdma_is_bcall(struct rpcrdma_xprt *r_xprt, struct rpcrdma_rep *rep) >>> #if defined(CONFIG_SUNRPC_BACKCHANNEL) >>> { >>> + struct rpc_xprt *xprt = &r_xprt->rx_xprt; >>> struct xdr_stream *xdr = &rep->rr_stream; >>> __be32 *p; >>> >>> + /* no bc service, not a bcall. */ >>> + if (xprt->bc_serv == NULL) >>> + return false; >>> + >>> if (rep->rr_proc != rdma_msg) >>> return false; >> I'm not sure what you mean above by "fault reply". >> The check here for whether the RPC/RDMA procedure is an RDMA_MSG >> is supposed to be enough to avoid any further processing of an >> RDMA_ERR type procedure. >> What kind of fault has occurred? Can you share with us the >> actual RPC/RDMA transport header that triggers the BUG? > > I have a nfs rdma server, but it handles drc reply wrongly sometimes, > it returns a bad format reply to nfs client. > Nfs rdma client treats the bad format reply as a bcall, and Doesn't this test return false: 1144 if (*p != cpu_to_be32(RPC_CALL)) 1145 return false; But OK: a malicious NFSv3 server can trigger a client crash. That's a bug. This is an additional conditional in a hot path. Would it make sense to move the new test to just after 1145? Even then, it could be a bcall, the client simply isn't prepared to process it. So "/* no bc service */" by itself would be a more accurate comment. -- Chuck Lever