Re: Spurious instability with NFSoRDMA under moderate load

On 10.08.2021 19:17, Chuck Lever III wrote:

What I see in this data is that the server is reporting

    SEQ4_STATUS_CB_PATH_DOWN

and the client is attempting to recover (repeatedly) using
BIND_CONN_TO_SESSION. But apparently the recovery didn't
actually work, because the server continues to report a
callback path problem.

[1712389.125641] nfs41_handle_sequence_flag_errors: "10.110.10.200" (client ID 6765f8600a675814) flags=0x00000001
[1712389.129264] nfs4_bind_conn_to_session: bind_conn_to_session was successful for server 10.110.10.200!

[1712389.171953] nfs41_handle_sequence_flag_errors: "10.110.10.200" (client ID 6765f8600a675814) flags=0x00000001
[1712389.178361] nfs4_bind_conn_to_session: bind_conn_to_session was successful for server 10.110.10.200!

[1712389.195606] nfs41_handle_sequence_flag_errors: "10.110.10.200" (client ID 6765f8600a675814) flags=0x00000001
[1712389.203891] nfs4_bind_conn_to_session: bind_conn_to_session was successful for server 10.110.10.200!
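For reference, the flags=0x00000001 value in those client log lines is the SEQUENCE status bit for SEQ4_STATUS_CB_PATH_DOWN. Below is a minimal Python sketch for decoding such values out of a kernel log, assuming the bit assignments from RFC 5661. The helper names (decode_seq4_flags, flags_from_log) are made up for illustration and are not part of any kernel or nfs-utils tooling.

```python
import re

# SEQUENCE sr_status_flags bits, as assigned in RFC 5661.
SEQ4_STATUS_FLAGS = {
    0x00000001: "SEQ4_STATUS_CB_PATH_DOWN",
    0x00000002: "SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING",
    0x00000004: "SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED",
    0x00000008: "SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED",
    0x00000010: "SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED",
    0x00000020: "SEQ4_STATUS_ADMIN_STATE_REVOKED",
    0x00000040: "SEQ4_STATUS_RECALLABLE_STATE_REVOKED",
    0x00000080: "SEQ4_STATUS_LEASE_MOVED",
    0x00000100: "SEQ4_STATUS_RESTART_RECLAIM_NEEDED",
    0x00000200: "SEQ4_STATUS_CB_PATH_DOWN_SESSION",
    0x00000400: "SEQ4_STATUS_BACKCHANNEL_FAULT",
    0x00000800: "SEQ4_STATUS_DEVID_CHANGED",
    0x00001000: "SEQ4_STATUS_DEVID_DELETED",
}

# Matches the client debug lines shown above; the exact message format
# is taken from this thread and may differ between kernel versions.
FLAGS_RE = re.compile(
    r"nfs41_handle_sequence_flag_errors: .*flags=0x([0-9a-fA-F]+)"
)


def decode_seq4_flags(value):
    """Return the names of all status bits set in value, lowest bit first."""
    return [name for bit, name in SEQ4_STATUS_FLAGS.items() if value & bit]


def flags_from_log(lines):
    """Decode the flags field of every matching line in a kernel log."""
    decoded = []
    for line in lines:
        match = FLAGS_RE.search(line)
        if match:
            decoded.append(decode_seq4_flags(int(match.group(1), 16)))
    return decoded
```

Feeding the saved dmesg output through flags_from_log() makes it easy to check whether anything other than the callback-path bit ever shows up between the BIND_CONN_TO_SESSION attempts.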

I guess it's time to switch to tracing on the server side
to see if you can nail down why the server's callback
requests are failing. On your NFS server, run:

  # trace-cmd record -e nfsd -e sunrpc -e rpcgss -e rpcrdma

at roughly the same point during your test that you captured
the previous client-side trace.

I wonder whether it would be worth reverting 6820bf77864d on the server, to have an easier way to reproduce this state.

It seems like the actual underlying issue is that the NFS server (and/or client) is unable to re-establish the backchannel once it gets disconnected, for whatever reason.

I have already rebooted the client and everything is working again, so otherwise I'll have to wait a potentially long time for this to happen again.

