Re: Reoccurring 5 second delays during NFS calls

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 21 Feb 2023, at 8:33, Florian Möller wrote:

> Hi all,
>
> unfortunately the patch did not help, neither using -o async nor using -o sync. We did some more tests (which is the reason for the delay of this reply):
>
> We used a qemu image housing both the NFS server and the client and did some kernel debugging.
> OS: Ubuntu 22.04.1
> Kernel: 5.15.78
> Mount line: rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,
> 	namlen=255,hard,proto=tcp,timeo=600,retrans=2,
> 	sec=krb5p,clientaddr=10.0.0.1,
> 	local_lock=none,addr=10.0.0.1
>
> We touched a file and then touched the file again. This triggers the delay reliably.
>
> We set breakpoints on all functions starting with nfs4 and on "update_open_stateid". The attached file "1sttouch.log" contains a gdb log of the first touch.
> "2ndtouch.log" shows the gdb output of the second touch. The delay occurs in line 116 in update_open_stateid.
>
> We then deleted all breakpoints and set a sole breakpoint on update_open_stateid. We touched the file again and used only the "next" command of gdb. The gdb output is in "2ndtouch-next.log", the delay occurs in line 8.
>
> Please let us know if you need more information or if you want us to perform further tests.
>
> Best regards,
> Florian Möller

Hi Florian,

The 5 second value and location of the delay is making me suspect something
is wrong with the open stateid sequence processing.

The client introduces 5-second delays in order to correctly order stateid
updates from the server.  Usually this happens because there are multiple
processes sending OPEN/CLOSE calls and the server processess them out of
order, or the client receives the responses out of order.

It would be helpful to have a network capture of the problem, along with the
matching output from these tracepoints on the client:

nfs4:nfs4_open_stateid_update
nfs4:nfs4_open_stateid_update_wait
nfs4:nfs4_close_stateid_update_wait
sunrpc:xs_stream_read_request
sunrpc:rpc_request
sunrpc:rpc_task_end

And these tracepoints on the server:

nfsd:nfsd_preprocess
sunrpc:svc_process

I'm interested in seeing how the client is processing the sequence numbers
of the open stateid, or if perhaps there's a delegation in play.

LMK if you need help with the tracepoints -- you can simply append those
lines into /sys/kernel/debug/tracing/set_event, then reproduce the problem.
The output of those tracepoints will be in /sys/kernel/debug/tracing/trace.

Ben





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux