Re: [PATCH] cifs: detect dead connections only when echoes are enabled.

Aurélien Aptel <aaptel@xxxxxxxx> · Thu, 29 Apr 2021 13:46:22 +0200

Hi Shyam,

Ok so I ended up looking at a lot of code to check this... And I'm still
unsure.

First, it looks like server->echoes is only used for smb2 and there is
a generic server->ops->can_echo() you can use that just returns
server->echoes it for smb2.

Shyam Prasad N <nspmangalore@xxxxxxxxx> writes:
> Hi,
>
> Recently, some xfstests and later some manual testing on multi-channel
> revealed that we detect unresponsive servers on some of the channels
> to the same server.
>
> The issue is seen when a channel is setup and sits idle without any
> traffic. Generally, we enable echoes and oplocks on a channel during
> the first request, based on the number of credits available on the
> channel. So on idle channels, we trip in our logic to check server
> unresponsiveness.

That makes sense but while looking at the code I see we always queue
echo request in cifs_get_tcp_session(), which is called when adding a
channel.

cifs_ses_add_channel() {
	ctx.echo_interval = ses->server->echo_interval / HZ;

	chan->server = cifs_get_tcp_session(&ctx);

	rc = cifs_negotiate_protocol(xid, ses) {
		server->tcpStatus = CifsGood;
	}

	rc = cifs_setup_session(xid, ses, cifs_sb->local_nls);
}

cifs_get_tcp_session() {

	INIT_DELAYED_WORK(&tcp_ses->echo, cifs_echo_request);

	tcp_ses->tcpStatus = CifsNeedNegotiate;

	tcp_ses->lstrp = jiffies;

	if (ctx->echo_interval >= SMB_ECHO_INTERVAL_MIN &&
		ctx->echo_interval <= SMB_ECHO_INTERVAL_MAX)
		tcp_ses->echo_interval = ctx->echo_interval * HZ;
	else
		tcp_ses->echo_interval = SMB_ECHO_INTERVAL_DEFAULT * HZ;

	queue_delayed_work(cifsiod_wq, &tcp_ses->echo, tcp_ses->echo_interval);
}

cifs_echo_request() {

	if (server->tcpStatus == CifsNeedNegotiate)
		echo_interval = 0; <=== branch taken
	else
		echo_interval = server->echo_interval;

	if (server->tcpStatus not in {NeedReconnect, Exiting, New}
	   && server->can_echo()
	   && jiffies > server->lstrp + echo_interval - HZ)
	{
		server->echo();
	}

	queue_delayed_work(cifsiod_wq, &server->echo, server->echo_interval);
        ===> echo_interval = 0 so calls itself immediatly
}

SMB2_echo() {
	if (server->tcpStatus == CifsNeedNegotiate) {
		/* No need to send echo on newly established connections */
		mod_delayed_work(cifsiod_wq, &server->reconnect, 0);
                ====> calls smb2_reconnect_server() immediatly if NeedNego
		return rc;
	}

}

smb2_reconnect_server() {
	// channel has no TCONs so it does essentially nothing
}

server_unresponsive() {
	if (server->tcpStatus in {Good, NeedNegotiate}
           && jiffies > server->lstrp + 3 * server->echo_interval)
        {
		cifs_server_dbg(VFS, "has not responded in %lu seconds. Reconnecting...\n",
			 (3 * server->echo_interval) / HZ);
		cifs_reconnect(server);
		return true;
	}
        return false;
}

So it looks to me that cifs_get_tcp_session() starts the
cifs_echo_request() work, which calls itself with no delay in an
infinite loop doing nothing (that's probably bad...) until session_setup
succeeds, after which the delay between the self-call is set.

During session_setup():

* last response time (lstrp) gets updated

* sending/recv requests should interact with the server
  credits and call change_conf(), which should enable server->echoes
  ==> is that part not working?

Once enabled, the echo_request workq will finally send the echo on the
wire, which should -/+ 1 credit and update lstrp.

> Attached a one-line fix for this. Have tested it in my environment.
> Another approach to fix this could be to enable echoes during
> initialization of a server struct. Or soon after the session setup.
> But I felt that this approach is better. Let me know if you feel
> otherwise.

I think the idea of your change is ok but there's probably also an issue
in crediting in session_setup()/change_conf() if echoes is not enabled
at this point no?

Cheers,
-- 
Aurélien Aptel / SUSE Labs Samba Team
GPG: 1839 CB5F 9F5B FB9B AA97  8C99 03C8 A49B 521B D5D3
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München)