Re: Failure to reconnect after cluster failover

Ross Lagerwall <ross.lagerwall@xxxxxxxxxx> · Mon, 25 Feb 2019 13:13:35 +0000

On 2/22/19 11:25 PM, Tom Talpey wrote:
-----Original Message-----
From: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
Sent: Friday, February 22, 2019 9:17 AM
To: Tom Talpey <ttalpey@xxxxxxxxxxxxx>; Steve French
<smfrench@xxxxxxxxx>
Cc: CIFS <linux-cifs@xxxxxxxxxxxxxxx>
Subject: Re: Failure to reconnect after cluster failover

On 2/21/19 5:59 PM, Tom Talpey wrote:
The reconnect is apparently using a dotted-quad as the servername, and you
can see the auth is forced to NTLM as a consequence. Is that the way you
initially mounted the share (i.e. mount 10.71.217.50:/smbshare /mnt)?

-----Original Message-----
From: linux-cifs-owner@xxxxxxxxxxxxxxx <linux-cifs-owner@xxxxxxxxxxxxxxx>
On Behalf Of Steve French
Sent: Thursday, February 21, 2019 9:07 AM
To: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
Cc: CIFS <linux-cifs@xxxxxxxxxxxxxxx>
Subject: Re: Failure to reconnect after cluster failover

A couple of quick thoughts.

Does this work on current kernels (5.0, for example)?

I was thinking about patches that might affect this, such as:
- "cifs: connect to servername instead of IP for IPC$ share"
- "smb3: on reconnect set PreviousSessionId field"
- Paulo's patches (which have a cifs-utils co-requisite) to reconnect to
the new IP address if the hostname's IP address changed, and his patches
adding support for failover
- Paulo's patch to remove trailing slashes from the server UNC name

I've reproduced this with 5.0-rc7 and the latest cifs-utils from git.
The share was mounted as follows (yes, by IP):

mount.cifs -o vers=3.0,cache=loose,actimeo=0,username=x,domain=y,password=z '//10.71.217.31/smbshare' /mnt

Here is the tcpdump when it fails to reconnect properly:
...

The initial connection is at timestamp 0s, reconnection at 13s,
STATUS_NETWORK_NAME_DELETED at 60s.

For comparison, here is a tcpdump using the "fix" from my previous mail:
...

The initial connection is at timestamp 0s, reconnection at 34s,
successful read request at 215s.

Note that the tree connect for IPC$ only happens _after_ the tree
connect for the share succeeds.

Thanks for the full traces, they clarify the situation. But I don't see any
meaningful difference in the client behavior. The ordering of the two
tree connects is the same in both traces: initially "IPC$" then
"smbshare", and on reconnect the other way around. So I'm unclear
whether your patch did anything.

There is definitely a difference. Before the patch, on reconnect the client:

* Connects to "smbshare", which fails
* Then connects to "IPC$", which succeeds
* Then tries again to connect to "smbshare", which fails repeatedly

After the patch, on reconnect the client:

* Connects to "smbshare", which fails
* Then tries again to connect to "smbshare", which succeeds after several
  retries
* Then tries to connect to "IPC$", which succeeds

This subtle reordering somehow makes it work. It may indeed be a server 
bug rather than a client bug. I was hoping someone could shed some light 
on this.
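
To make the difference concrete, the two reconnect sequences listed above can be written down and compared. This is only an illustrative sketch: the step lists restate the bullets above, and the helper names (first_divergence, ipc_before_share_success) are made up for this mail, not taken from the cifs code.

```python
# Tree-connect attempts on reconnect, restating the two lists above.
# Each step is (share name, outcome as described in this mail).
BEFORE_PATCH = [
    ("smbshare", "fails"),
    ("IPC$", "succeeds"),
    ("smbshare", "fails repeatedly"),
]
AFTER_PATCH = [
    ("smbshare", "fails"),
    ("smbshare", "succeeds after several retries"),
    ("IPC$", "succeeds"),
]


def first_divergence(a, b):
    """Return the index of the first step where the sequences differ, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # Identical up to the shorter length; they differ only if lengths differ.
    return None if len(a) == len(b) else min(len(a), len(b))


def ipc_before_share_success(seq):
    """True if IPC$ connects before any successful connect to smbshare."""
    for share, outcome in seq:
        if outcome.startswith("succeeds"):
            return share == "IPC$"
    return False
```

The sequences diverge at the second step: before the patch the IPC$ tree connect happens before "smbshare" has been re-established, and after the patch it happens only afterwards.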

The STATUS_NETWORK_NAME_DELETED is a consequence of the failed
re-establishment of the tree connect, and is not itself the problem. The
server is simply timing out the treeid, since the client did not successfully
reclaim it. The repeated STATUS_BAD_NETWORK_NAME is the issue.
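
When reading a trace, this distinction can be applied mechanically. The sketch below assumes a hypothetical list of (timestamp, NT status) pairs already extracted from a capture; the two status values are the standard NTSTATUS codes, but the input format and the function name are assumptions for illustration only.

```python
# Standard NTSTATUS values for the two errors discussed above.
STATUS_NETWORK_NAME_DELETED = 0xC00000C9  # consequence: the server timed out the tree id
STATUS_BAD_NETWORK_NAME = 0xC00000CC      # the actual issue: tree connect rejected

NT_STATUS_NAMES = {
    STATUS_NETWORK_NAME_DELETED: "STATUS_NETWORK_NAME_DELETED",
    STATUS_BAD_NETWORK_NAME: "STATUS_BAD_NETWORK_NAME",
}


def split_cause_and_consequence(events):
    """Split (timestamp_s, nt_status) pairs into tree-connect failures
    (the cause) and later name-deleted errors (the consequence).

    `events` is a hypothetical, already-parsed representation of a trace.
    """
    causes = [(t, NT_STATUS_NAMES[s]) for t, s in events
              if s == STATUS_BAD_NETWORK_NAME]
    consequences = [(t, NT_STATUS_NAMES[s]) for t, s in events
                    if s == STATUS_NETWORK_NAME_DELETED]
    return causes, consequences
```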

Are you sure the clustered server is recovering properly when you are
forcing the failover? For example, if it's a two-node cluster, maybe node A
can take over node B, but node B has issues taking over node A. Is there
anything relevant in the server logs?

It's a two-node cluster. The behaviour happens reliably when failing
over in either direction. After failover, the server state is consistent.
E.g. after a failover from node A to node B, node B shows itself as the
primary server and node A is marked as down. I couldn't find
anything interesting in the server logs.

Thanks,
--
Ross Lagerwall