On 2/10/2025 8:52 AM, Chuck Lever wrote:
On 2/9/25 8:34 PM, Rick Macklem wrote:
On Sun, Feb 9, 2025 at 3:34 PM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
On Sun, 2025-02-09 at 13:39 -0800, Rick Macklem wrote:
Hi,
I thought I'd post here instead of nfsv4@xxxxxxxx since I
think the Linux server has been implementing this recently.
I am not interested in making the FreeBSD NFSv4.1/4.2
server dynamically resize slot tables in sessions, but I do
want to make sure the FreeBSD client handles this case correctly.
Here is what I believe is supposed to be done:
For growing the slot table...
- Server/replier sends SEQUENCE replies with both
sr_highest_slotid and sr_target_highest_slotid set to a larger value.
--> The client can then use those slots with
sa_sequenceid set to 1 for the first SEQUENCE operation on
each of them.
For shrinking the slot table...
- Server/replier sends SEQUENCE replies with a smaller
value for sr_target_highest_slotid.
- The server/replier waits for the client to do a SEQUENCE
operation on one of the slot(s) where the server has replied
with the smaller value for sr_target_highest_slotid, with an
sa_highest_slotid value <= the new smaller
sr_target_highest_slotid.
- Once this happens, the server/replier can set sr_highest_slotid
to the lower value of sr_target_highest_slotid and throw the
slot table entries above that value away.
--> Once the client sees a reply with sr_target_highest_slotid set
to the lower value, it should not do any additional SEQUENCE
operations with a sa_slotid > sr_target_highest_slotid.
Does the above sound correct?
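For illustration, the grow/shrink rules listed above can be modelled roughly as follows. This is only a sketch in Python, not the Linux NFSD or FreeBSD code: the SlotTable class, its method names, and the told_target bookkeeping are hypothetical, and the field names follow RFC 8881 (sr_highest_slotid, sr_target_highest_slotid, sa_highest_slotid). Slot IDs are 0-based.

```python
from dataclasses import dataclass, field


@dataclass
class SlotTable:
    """Toy server-side view of one session's slot table (hypothetical)."""
    highest_slotid: int          # reported as sr_highest_slotid
    target_highest_slotid: int   # reported as sr_target_highest_slotid
    # Slots on which a reply carrying the reduced target has been sent.
    told_target: set = field(default_factory=set)

    def grow(self, new_highest: int) -> None:
        # Growing takes effect at once: both values rise together.
        self.highest_slotid = new_highest
        self.target_highest_slotid = new_highest
        self.told_target.clear()

    def request_shrink(self, new_target: int) -> None:
        # Shrinking only lowers the target; the slots are kept for now.
        self.target_highest_slotid = new_target
        self.told_target.clear()

    def on_sequence(self, sa_slotid: int, sa_highest_slotid: int) -> tuple:
        """Handle one SEQUENCE call.

        Returns (sr_highest_slotid, sr_target_highest_slotid) for the reply.
        """
        shrinking = self.target_highest_slotid < self.highest_slotid
        if (shrinking and sa_slotid in self.told_target
                and sa_highest_slotid <= self.target_highest_slotid):
            # The client already saw the new target on this slot and now
            # confirms it has nothing outstanding above it, so the
            # entries above the target can be thrown away.
            self.highest_slotid = self.target_highest_slotid
            self.told_target.clear()
        elif shrinking:
            # This reply carries the reduced target on sa_slotid.
            self.told_target.add(sa_slotid)
        return self.highest_slotid, self.target_highest_slotid
```

With a table created as SlotTable(7, 7) and request_shrink(3), the first SEQUENCE on slot 0 is answered with the old highest slot and the new target; only a later SEQUENCE on slot 0 with sa_highest_slotid <= 3 lets the model drop slots 4 through 7.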
The above captures the case where the server is adjusting using
OP_SEQUENCE. However there is another potential mode where the server
sends out a CB_RECALL_SLOT.
Ouch. I completely forgot about this one and I'll admit the FreeBSD client
doesn't have it implemented.
The client is free to refuse to return slots, but the penalty may be
a forcible session disconnect.
I agree you've captured the basics of the graceful-reduction scenario,
but I do wonder if nconnect > 1 might impact the termination condition.
Because nconnect may impact the ordering of request arrival at the
server, it may be possible to have a timing window where one connection
may signal a reduction while another connection's request is still
outstanding?
Tom.
Just fyi, does the Linux server do this, or do I have some time to implement it?
As far as I can tell, Linux NFSD does not yet implement CB_RECALL_SLOT.
In the latter case, it is up to the client to send out enough SEQUENCE
operations on the forward channel to implicitly acknowledge the change
in slots using the sa_highest_slotid field (see RFC 8881, Section 20.8.3).
If the client was completely idle when it received the CB_RECALL_SLOT,
it should only need to send out 1 extra SEQUENCE op, but if using RDMA,
then it has to keep pounding out "RDMA send" messages until the RDMA
credit count has been brought down too.
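A rough sketch of the client side of that acknowledgement, in Python. Everything here is an assumption made for illustration (the ClientSlots class and its methods are hypothetical, not the Linux or FreeBSD client), and RDMA credit accounting is not modelled at all. It only shows how sa_highest_slotid, the highest slot ID with a request outstanding, drops to or below the recalled target once the higher slots drain.

```python
class ClientSlots:
    """Toy client-side slot allocator for one session (hypothetical)."""

    def __init__(self, highest_slotid: int) -> None:
        self.target = highest_slotid  # highest slot ID the client may use
        self.in_use = set()           # slot IDs with a request outstanding

    def on_recall_slot(self, rca_target_highest_slotid: int) -> None:
        # CB_RECALL_SLOT: stop starting new requests above the target.
        self.target = rca_target_highest_slotid

    def start_request(self):
        """Pick the lowest free slot within the target.

        Returns (sa_slotid, sa_highest_slotid), or None if every
        permitted slot is busy.
        """
        for slotid in range(self.target + 1):
            if slotid not in self.in_use:
                self.in_use.add(slotid)
                # Highest slot with a request outstanding, including
                # the request being sent now.
                return slotid, max(self.in_use)
        return None

    def finish_request(self, slotid: int, sr_target_highest_slotid: int) -> None:
        # The reply frees the slot and may also carry a new target.
        self.in_use.discard(slotid)
        self.target = sr_target_highest_slotid
```

In this model an idle client that receives a recall to target 3 sends a single SEQUENCE on slot 0 with sa_highest_slotid 0, which already acknowledges the reduction. A client with a request still outstanding on slot 5 keeps reporting sa_highest_slotid 5 until that request completes.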
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx