On Sun, 2021-04-25 at 02:29 +0000, Rick Macklem wrote:
> Hi,
>
> I have been running a simple test using two clients (one FreeBSD
> and the other Linux, Fedora Core 30, 5.2 kernel) with delegations
> enabled in the server.
>
> The test consists of running the connectathon general tests
> alternately on each client, using the same directory on the
> server.
> --> As such, each one results in CB_RECALLs of delegations
>     from the other client.
> Everything seems fine until the server does multiple concurrent
> CB_RECALLs for different files/delegations using different
> callback session slots.
> --> Then the Linux client decides it must create a new connection,
>     which breaks the back channel.
>     After 0.1sec, the FreeBSD server notices the broken back
>     channel and starts setting SEQ4_STATUS_CB_PATH_DOWN.
> --> 15sec after that, the Linux client does a
>     BindConnectionToSession
>     and things start working again.
>
> The mystery to me is why the client decides to create a new TCP
> connection, forcing this 15sec hiccup each time it happens?
>
> If you are interested in looking at a packet capture, you can
> % fetch https://people.freebsd.org/~rmacklem/twoclientdeleg.pcap
> There are multiple examples in it. One is at:
> packet# 3518, 3520, 3521 CB_RECALL requests for 3 different
>         delegations
>     time 137.5
> --> This is followed by a close and open of a new TCP connection...
> packet# 3582 - first one with SEQ4_STATUS_CB_PATH_DOWN at
>     time 137.6
> packet# 3604 - client does a BindConnectionToSession at
>     time 152.7
> Then things start to happen again...
> 192.168.1.5 - FreeBSD server
> 192.168.1.6 - Linux client
> 192.168.1.13 - FreeBSD client
>
> If this is a known issue that you think is fixed in a more recent
> Linux kernel, then sorry about the noise.
>

Should have been fixed in Linux 5.3 by commit 7402a4fedc2b ("SUNRPC:
Fix up backchannel slot table accounting") AFAICT.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx