On Wed, Aug 11, 2021 at 4:01 PM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>
>
>
> > On Aug 11, 2021, at 3:46 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> >
> > On Wed, Aug 11, 2021 at 2:52 PM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> >>
> >>
> >>
> >>> On Aug 11, 2021, at 2:38 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> >>>
> >>> On Wed, Aug 11, 2021 at 1:30 PM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Aug 11, 2021, at 12:20 PM, Timo Rothenpieler <timo@xxxxxxxxxxxxxxxx> wrote:
> >>>>>
> >>>>> resulting dmesg and trace logs of both client and server are attached.
> >>>>>
> >>>>> Test procedure:
> >>>>>
> >>>>> - start tracing on client and server
> >>>>> - mount NFS on client
> >>>>> - immediately run 'xfs_io -fc "copy_range testfile" testfile.copy' (which succeeds)
> >>>>> - wait 10~15 minutes for the backchannel to time out (still running 5.12.19 with the fix for that reverted)
> >>>>> - run xfs_io command again, getting stuck now
> >>>>> - let it sit there stuck for a minute, then cancel it
> >>>>> - run the command again
> >>>>> - while it's still stuck, finished recording the logs and traces
> >>>>
> >>>> The server tries to send CB_OFFLOAD when the offloaded copy
> >>>> completes, but finds the backchannel transport is not connected.
> >>>>
> >>>> The server can't report the problem until the client sends a
> >>>> SEQUENCE operation, but there's really no other traffic going
> >>>> on, so it just waits.
> >>>>
> >>>> The client eventually sends a singleton SEQUENCE to renew its
> >>>> lease. The server replies with the SEQ4_STATUS_BACKCHANNEL_FAULT
> >>>> flag set at that point. Client's recovery is to destroy that
> >>>> session and create a new one. That appears to be successful.
> >>>>
> >>>> But the server doesn't send another CB_OFFLOAD to let the client
> >>>> know the copy is complete, so the client hangs.
> >>>>
> >>>> This seems to be peculiar to COPY_OFFLOAD, but I wonder if the
> >>>> other CB operations suffer from the same "failed to retransmit
> >>>> after the CB path is restored" issue. It might not matter for
> >>>> some of them, but for others like CB_RECALL, that could be
> >>>> important.
> >>>
> >>> Thank you for the analysis Chuck (btw I haven't seen any attachments
> >>> with Timo's posts so I'm assuming some offline communication must have
> >>> happened).
> >>> ?
> >>> I'm looking at the code and wouldn't the mentioned flags be set on the
> >>> CB_SEQUENCE operation?
> >>
> >> CB_SEQUENCE is sent from server to client, and that can't work if
> >> the callback channel is down.
> >>
> >> So the server waits for the client to send a SEQUENCE and it sets
> >> the SEQ4_STATUS_BACKCHANNEL_FAULT in its reply.
> >
> > yes scratch that, this is for when CB_SEQUENCE has it in its reply.
> >
> >>> nfsd4_cb_done() has code to mark the channel
> >>> and retry (or another way of saying this, this code should generically
> >>> handle retrying whatever operation it is be it CB_OFFLOAD or
> >>> CB_RECALL)?
> >>
> >> cb_done() marks the callback fault, but as far as I can tell the
> >> RPC is terminated at that point and there is no subsequent retry.
> >> The RPC_TASK flags on the CB_OFFLOAD operation cause that RPC to
> >> fail immediately if there's no connection.
> >>
> >> And in the BACKCHANNEL_FAULT case, the bc_xprt is destroyed as
> >> part of recovery. I think that would kill all pending RPC tasks.
> >>
> >>
> >>> Is that not working (not sure if this is a question or a
> >>> statement).... I would think that would be the place to handle this
> >>> problem.
> >>
> >> IMHO the OFFLOAD code needs to note that the CB_OFFLOAD RPC
> >> failed and then try the call again once the new backchannel is
> >> available.
> >
> > I still argue that this needs to be done generically not per operation
> > as CB_RECALL has the same problem.
>
> Probably not just CB_RECALL, but agreed, there doesn't seem to
> be any mechanism that can re-drive callback operations when the
> backchannel is replaced.
>
>
> > If CB_RECALL call is never
> > answered, rpc would fail with ETIMEOUT.
>
> Well we have a mechanism already to deal with that case, which is
> delegation revocation. That's not optimal, but at least it won't
> leave the client hanging forever.
>
>
> > nfsd4_cb_recall_done() returns
> > 1 back to the nfsd4_cb_done() which then based on the rpc task status
> > would set the callback path down but will do nothing else.
> > nfsd4_cb_offload_one() just always returns 1.
> >
> > Now given that you say during the backchannel fault case, we'll have
> > to destroy that backchannel and create a new one.
>
> BACKCHANNEL_FAULT is the particular case that Timo's reproducer
> exercises. I haven't looked closely at the CB_PATH_DOWN and
> CB_PATH_DOWN_SESSION cases, which use BIND_CONN_TO_SESSION,
> to see if those also destroy the backchannel transport.
>
> However I suspect they are just as broken. There isn't much
> regression testing around these scenarios.
>
>
> > Are we losing all those RPCs that flagged it as faulty?
>
> I believe the RPC tasks are already gone at that point. I'm
> just saying that /hypothetically/ if the RPCs were not set up
> to exit immediately, that wouldn't matter because the transport
> and client are destroyed as part of restoring the backchannel.
>
> Perhaps instead of destroying the rpc_clnt, the server should
> retain it and simply recycle the underlying transport data
> structures.
>
>
> > At this point, nothing obvious
> > comes to mind how to engineer the recovery but it should be done.
>
> Well I think we need to:
>
> 1. Review the specs to see if there's any indication of how to
> recover from this situation

I guess one fix we can consider on the client is to make waiting for
the CB_OFFLOAD time based and then use OFFLOAD_STATUS operation to
check on the copy status.

> 2. Perhaps construct a way to move CB RPCs to a workqueue so
> they can be retried indefinitely (or figure out when they should
> stop being retried: COPY_OFFLOAD_CANCEL comes to mind).
>
> 3. Keep an eye peeled for cases where a malicious or broken
> client could cause CB operations to tie up all the server's
> nfsd threads or other resources.
>
> --
> Chuck Lever
>
>
>
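For what it's worth, here is a rough sketch of what the client-side
idea above (bounding the wait for CB_OFFLOAD, then polling the server
with OFFLOAD_STATUS) could look like. This is only an illustration,
not a patch: wait_for_completion_interruptible_timeout() is the real
kernel primitive, but offload_status() and copy_is_done() are made-up
helper names standing in for an OFFLOAD_STATUS round trip.

static int wait_for_cb_offload(struct completion *cb_offload_done,
			       void *copy_ctx)
{
	/* Arbitrary poll interval; a real fix would probably tie this
	 * to the lease time or make it tunable. */
	unsigned long poll_interval = 10 * HZ;
	u64 bytes_copied;
	long ret;

	for (;;) {
		/* > 0: CB_OFFLOAD arrived; 0: timed out; < 0: interrupted */
		ret = wait_for_completion_interruptible_timeout(cb_offload_done,
								poll_interval);
		if (ret > 0)
			return 0;
		if (ret < 0)
			return ret;

		/* No callback yet: ask the server directly instead of
		 * sleeping forever. Hypothetical OFFLOAD_STATUS call. */
		ret = offload_status(copy_ctx, &bytes_copied);
		if (ret < 0)
			return ret;
		if (copy_is_done(copy_ctx, bytes_copied))	/* hypothetical */
			return 0;
	}
}

Of course that only helps COPY, since the client has something it can
poll for. Callbacks like CB_RECALL would still need the server-side
re-drive you describe in item 2.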