From: Shyam Prasad N <nspmangalore@xxxxxxxxx> commit 69cba9d3c1284e0838ae408830a02c4a063104bc upstream. When the number of responses with status of STATUS_IO_TIMEOUT exceeds a specified threshold (NUM_STATUS_IO_TIMEOUT), we reconnect the connection. But we do not return the mid, or the credits returned for the mid, or reduce the number of in-flight requests. This bug could result in the server->in_flight count to go bad, and also cause a leak in the mids. This change moves the check to a few lines below where the response is decrypted, even of the response is read from the transform header. This way, the code for returning the mids can be reused. Also, the cifs_reconnect was reconnecting just the transport connection before. In case of multi-channel, this may not be what we want to do after several timeouts. Changed that to reconnect the session and the tree too. Also renamed NUM_STATUS_IO_TIMEOUT to a more appropriate name MAX_STATUS_IO_TIMEOUT. Fixes: 8e670f77c4a5 ("Handle STATUS_IO_TIMEOUT gracefully") Signed-off-by: Shyam Prasad N <sprasad@xxxxxxxxxxxxx> Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx> [Harshit: Backport to 5.15.y] Conflicts: fs/cifs/connect.c -- 5.15.y doesn't have commit 183eea2ee5ba ("cifs: reconnect only the connection and not smb session where possible") -- User cifs_reconnect(server) instead of cifs_reconnect(server, true) Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@xxxxxxxxxx> --- Would be nice to get a review from author/maintainer of the upstream patch. A backport request was made previously but the patch didnot apply cleanly then: https://lore.kernel.org/all/CANT5p=oPGnCd4H5ppMbAiHsAKMor3LT_aQRqU7tKu=q6q1BGQg@xxxxxxxxxxxxxx/ xfstests with cifs done: before and after patching with this patch on 5.15.149. There is no change in test results before and after the patch. Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007 generic/010 generic/011 generic/013 generic/014 generic/023 generic/024 generic/028 generic/029 generic/030 generic/036 generic/069 generic/074 generic/075 generic/084 generic/091 generic/095 generic/098 generic/100 generic/109 generic/112 generic/113 generic/124 generic/127 generic/129 generic/130 generic/132 generic/133 generic/135 generic/141 generic/169 generic/198 generic/207 generic/208 generic/210 generic/211 generic/212 generic/221 generic/239 generic/241 generic/245 generic/246 generic/247 generic/248 generic/249 generic/257 generic/263 generic/285 generic/286 generic/308 generic/309 generic/310 generic/315 generic/323 generic/339 generic/340 generic/344 generic/345 generic/346 generic/354 generic/360 generic/393 generic/394 Not run: generic/010 generic/286 generic/315 Failures: generic/075 generic/112 generic/127 generic/285 Failed 4 of 68 tests SECTION -- smb3 ========================= Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007 generic/010 generic/011 generic/013 generic/014 generic/023 generic/024 generic/028 generic/029 generic/030 generic/036 generic/069 generic/074 generic/075 generic/084 generic/091 generic/095 generic/098 generic/100 generic/109 generic/112 generic/113 generic/124 generic/127 generic/129 generic/130 generic/132 generic/133 generic/135 generic/141 generic/169 generic/198 generic/207 generic/208 generic/210 generic/211 generic/212 generic/221 generic/239 generic/241 generic/245 generic/246 generic/247 generic/248 generic/249 generic/257 generic/263 generic/285 generic/286 generic/308 generic/309 generic/310 generic/315 generic/323 generic/339 generic/340 generic/344 generic/345 generic/346 generic/354 generic/360 generic/393 generic/394 Not run: generic/010 generic/014 generic/129 generic/130 generic/239 Failures: generic/075 generic/091 generic/112 generic/127 generic/263 generic/285 generic/286 Failed 7 of 68 tests SECTION -- smb21 ========================= Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007 generic/010 generic/011 generic/013 generic/014 generic/023 generic/024 generic/028 generic/029 generic/030 generic/036 generic/069 generic/074 generic/075 generic/084 generic/091 generic/095 generic/098 generic/100 generic/109 generic/112 generic/113 generic/124 generic/127 generic/129 generic/130 generic/132 generic/133 generic/135 generic/141 generic/169 generic/198 generic/207 generic/208 generic/210 generic/211 generic/212 generic/221 generic/239 generic/241 generic/245 generic/246 generic/247 generic/248 generic/249 generic/257 generic/263 generic/285 generic/286 generic/308 generic/309 generic/310 generic/315 generic/323 generic/339 generic/340 generic/344 generic/345 generic/346 generic/354 generic/360 generic/393 generic/394 Not run: generic/010 generic/014 generic/129 generic/130 generic/239 generic/286 generic/315 Failures: generic/075 generic/112 generic/127 generic/285 Failed 4 of 68 tests SECTION -- smb2 ========================= Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007 generic/010 generic/011 generic/013 generic/014 generic/023 generic/024 generic/028 generic/029 generic/030 generic/036 generic/069 generic/074 generic/075 generic/084 generic/091 generic/095 generic/098 generic/100 generic/109 generic/112 generic/113 generic/124 generic/127 generic/129 generic/130 generic/132 generic/133 generic/135 generic/141 generic/169 generic/198 generic/207 generic/208 generic/210 generic/211 generic/212 generic/221 generic/239 generic/241 generic/245 generic/246 generic/247 generic/248 generic/249 generic/257 generic/263 generic/285 generic/286 generic/308 generic/309 generic/310 generic/315 generic/323 generic/339 generic/340 generic/344 generic/345 generic/346 generic/354 generic/360 generic/393 generic/394 Not run: generic/010 generic/286 generic/315 Failures: generic/075 generic/112 generic/127 generic/285 Failed 4 of 68 tests --- fs/cifs/connect.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index a521c705b0d7..a3e4811b7871 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -59,7 +59,7 @@ extern bool disable_legacy_dialects; #define TLINK_IDLE_EXPIRE (600 * HZ) /* Drop the connection to not overload the server */ -#define NUM_STATUS_IO_TIMEOUT 5 +#define MAX_STATUS_IO_TIMEOUT 5 struct mount_ctx { struct cifs_sb_info *cifs_sb; @@ -965,6 +965,7 @@ cifs_demultiplex_thread(void *p) struct mid_q_entry *mids[MAX_COMPOUND]; char *bufs[MAX_COMPOUND]; unsigned int noreclaim_flag, num_io_timeout = 0; + bool pending_reconnect = false; noreclaim_flag = memalloc_noreclaim_save(); cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current)); @@ -1004,6 +1005,8 @@ cifs_demultiplex_thread(void *p) cifs_dbg(FYI, "RFC1002 header 0x%x\n", pdu_length); if (!is_smb_response(server, buf[0])) continue; + + pending_reconnect = false; next_pdu: server->pdu_size = pdu_length; @@ -1063,10 +1066,13 @@ cifs_demultiplex_thread(void *p) if (server->ops->is_status_io_timeout && server->ops->is_status_io_timeout(buf)) { num_io_timeout++; - if (num_io_timeout > NUM_STATUS_IO_TIMEOUT) { - cifs_reconnect(server); + if (num_io_timeout > MAX_STATUS_IO_TIMEOUT) { + cifs_server_dbg(VFS, + "Number of request timeouts exceeded %d. Reconnecting", + MAX_STATUS_IO_TIMEOUT); + + pending_reconnect = true; num_io_timeout = 0; - continue; } } @@ -1113,6 +1119,11 @@ cifs_demultiplex_thread(void *p) buf = server->smallbuf; goto next_pdu; } + + /* do this reconnect at the very end after processing all MIDs */ + if (pending_reconnect) + cifs_reconnect(server); + } /* end while !EXITING */ /* buffer usually freed in free_mid - need to free it here on exit */ -- 2.43.0