On Tue, 22 Mar 2022, trondmy@xxxxxxxxxx wrote: > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > Make sure that rpciod and xprtiod are always using the same slab > allocation modes. > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> .... > xs_stream_prepare_request(struct rpc_rqst *req) > { > xdr_free_bvec(&req->rq_rcv_buf); > - req->rq_task->tk_status = xdr_alloc_bvec(&req->rq_rcv_buf, GFP_KERNEL); > + req->rq_task->tk_status = xdr_alloc_bvec( > + &req->rq_rcv_buf, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN); > } I did some testing of swap-over-NFS, and got a crash quite quickly, due to this change. The problem is that GFP_KERNEL allocations almost never fail. Multi-page allocations might occasionally fail, and others might fail for a process that has been killed by the OOM killer (or maybe just has a fatal signal pending), but in general GFP_KERNEL is more likely to wait (and wait and wait) than to fail. So the failure paths haven't been tested. xs_stream_prepare_request() is called from xprt_request_prepare(), which is called from xprt_request_enqueue_receive() which is called in call_encode() *after* ->tk_status has been tested. So when the above code sets ->tk_status to -ENOMEM - which is now more likely - that fact is ignored and we get a crash [ 298.911356] Workqueue: xprtiod xs_stream_data_receive_workfn [ 298.911696] RIP: 0010:_copy_to_iter+0x1cc/0x435 .. [ 298.918259] __skb_datagram_iter+0x64/0x225 [ 298.918507] skb_copy_datagram_iter+0xe9/0xf2 [ 298.918767] tcp_recvmsg_locked+0x653/0x77e [ 298.919015] tcp_recvmsg+0x100/0x188 [ 298.919226] inet_recvmsg+0x5d/0x86 [ 298.919431] xs_read_stream_request.constprop.0+0x247/0x378 [ 298.919754] xs_read_stream.constprop.0+0x1c2/0x39b [ 298.920038] xs_stream_data_receive_workfn+0x50/0x160 [ 298.920331] process_one_work+0x267/0x422 [ 298.920568] worker_thread+0x193/0x234 So we really need to audit all these places where we add __GFP_NORETRY and ensure errors are actually handled. For call_encode(), it might be easiest to move /* Add task to reply queue before transmission to avoid races */ if (rpc_reply_expected(task)) xprt_request_enqueue_receive(task); up before the /* Did the encode result in an error condition? */ if (task->tk_status != 0) { and change it to /* Add task to reply queue before transmission to avoid races */ if (task->tk_status == 0 && rpc_reply_expected(task)) xprt_request_enqueue_receive(task); I'll try a bit more testing and auditing. Thanks, NeilBrown