On 1/22/25 11:50 AM, Jeff Layton wrote:
On Wed, 2025-01-22 at 11:06 -0500, Chuck Lever wrote:
On 1/22/25 10:44 AM, Jeff Layton wrote:
On Wed, 2025-01-22 at 10:20 -0500, Chuck Lever wrote:
On 1/22/25 10:10 AM, Jeff Layton wrote:
The v4.0 client always restarts the callback when the connection is shut
down (which is indicated by RPC_SIGNALLED()). The RPC is then requeued
and the result eventually should complete (or be aborted).
The v4.1 code instead processes the result and only later decides to
restart the call. Even more problematic is the fact that it releases the
slot beforehand. The restarted call may get a new slot, which
could break DRC handling.
"break DRC handling" -- I'd like to understand this.
NFSD always sets cachethis to false in CB_SEQUENCE, so there is no DRC
for these operations. The only thing the client saves is the slot
sequence number IIUC.
Is retrying an uncached operation via a different slot a problem?
Ahh, I missed that we always set cachethis to false. So, there is
probably no problem with the DRC after all. Still, I don't see a
good argument for processing the CB_SEQUENCE result, when we intend to
retransmit the call anyway.
I expect that the rationale is that the slot sequence number needs to be
advanced appropriately before the slot can be used again.
Once RPC_SIGNALLED returns true, the callback code can either trust the
result of the rpc_task or not. If it's going to trust that result, then
there is no need to restart the call.
If it's not going to trust it, then the RPC call might as well have not
happened, and there is no need to increment the slot sequence number or
do anything else.
Is my understanding wrong here?
It might be.
The callback client is careful to initialize cb_seq_status to 1 before
it sends the RPC call. Thus if cb_seq_status is any value other than 1
in nfsd4_cb_sequence_done(), that means an RPC reply was received
successfully and the XDR decoder was successful.
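To make the sentinel idea concrete, here is a rough sketch. The struct and
helper names are made up for illustration and are not the nfsd ones; only
the "initialize to 1, then overwrite on successful decode" pattern comes
from the description above. It also assumes a decoded status is never
equal to 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-callback state; not the nfsd struct. */
struct cb_sketch {
	int seq_status;		/* plays the role of cb_seq_status */
};

/* Sentinel meaning "no CB_SEQUENCE reply has been decoded yet". */
#define CB_SEQ_UNSET	1

/* Before the RPC call is sent: arm the sentinel. */
static inline void cb_sketch_prepare(struct cb_sketch *cb)
{
	cb->seq_status = CB_SEQ_UNSET;
}

/* The XDR decoder succeeded: record the status carried in the reply. */
static inline void cb_sketch_decoded(struct cb_sketch *cb, int status)
{
	/* Assumption: a decoded status never collides with the sentinel. */
	assert(status != CB_SEQ_UNSET);
	cb->seq_status = status;
}

/* In the done callback: did a reply arrive and decode successfully? */
static inline bool cb_sketch_reply_seen(const struct cb_sketch *cb)
{
	return cb->seq_status != CB_SEQ_UNSET;
}
```

Note that nothing here looks at whether the task was signalled. The
sentinel alone says whether a reply was decoded.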
RPC_SIGNALLED doesn't have anything to do with whether a reply arrived
or can be trusted.
nfsd4_cb_sequence_done() needs to process the reply unconditionally,
otherwise the server and client will disagree on the slot sequence
number for that slot.
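
A toy model of the disagreement, in case it helps. This is not nfsd code:
the types and function names are invented, and the client side is reduced
to "accept only the next sequence ID on the slot".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: each side's view of one backchannel slot. */
struct slot_view {
	uint32_t seqid;		/* last sequence ID this side considers used */
};

/* Client side: accept a CB_SEQUENCE only if it carries the next seqid. */
static inline bool client_accepts(struct slot_view *client, uint32_t sent)
{
	if (sent != client->seqid + 1)
		return false;
	client->seqid = sent;
	return true;
}

/* Server side: seqid to place in the next CB_SEQUENCE on this slot. */
static inline uint32_t server_next_seqid(const struct slot_view *server)
{
	return server->seqid + 1;
}

/*
 * Server side: a successful reply was decoded, so the client has already
 * advanced its slot. Advance ours too, signalled task or not.
 */
static inline void server_reply_ok(struct slot_view *server)
{
	server->seqid++;
}

/* Run one exchange, then report whether the next call is accepted. */
static inline bool next_call_accepted(bool process_reply)
{
	struct slot_view server = { 0 }, client = { 0 };
	bool ok;

	/* First call goes out and the client executes it. */
	ok = client_accepts(&client, server_next_seqid(&server));
	assert(ok);
	(void)ok;

	if (process_reply)
		server_reply_ok(&server);

	/* Second call on the same slot. */
	return client_accepts(&client, server_next_seqid(&server));
}
```

With process_reply false, the server reuses the old sequence ID and the
client no longer sees it as a new request on that slot.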
--
Chuck Lever