On Mon, Feb 26, 2024 at 04:31:16PM +0530, Shyam Prasad N wrote: > On Mon, Feb 26, 2024 at 4:00 PM Harshit Mogalapalli > <harshit.m.mogalapalli@xxxxxxxxxx> wrote: > > > > From: Shyam Prasad N <nspmangalore@xxxxxxxxx> > > > > commit 69cba9d3c1284e0838ae408830a02c4a063104bc upstream. > > > > When the number of responses with status of STATUS_IO_TIMEOUT > > exceeds a specified threshold (NUM_STATUS_IO_TIMEOUT), we reconnect > > the connection. But we do not return the mid, or the credits > > returned for the mid, or reduce the number of in-flight requests. > > > > This bug could result in the server->in_flight count to go bad, > > and also cause a leak in the mids. > > > > This change moves the check to a few lines below where the > > response is decrypted, even of the response is read from the > > transform header. This way, the code for returning the mids > > can be reused. > > > > Also, the cifs_reconnect was reconnecting just the transport > > connection before. In case of multi-channel, this may not be > > what we want to do after several timeouts. Changed that to > > reconnect the session and the tree too. > > > > Also renamed NUM_STATUS_IO_TIMEOUT to a more appropriate name > > MAX_STATUS_IO_TIMEOUT. > > > > Fixes: 8e670f77c4a5 ("Handle STATUS_IO_TIMEOUT gracefully") > > Signed-off-by: Shyam Prasad N <sprasad@xxxxxxxxxxxxx> > > Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx> > > [Harshit: Backport to 5.15.y] > > Conflicts: > > fs/cifs/connect.c -- 5.15.y doesn't have commit 183eea2ee5ba > > ("cifs: reconnect only the connection and not smb session where > > possible") -- User cifs_reconnect(server) instead of > > cifs_reconnect(server, true) > > > > Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@xxxxxxxxxx> > > --- > > Would be nice to get a review from author/maintainer of the upstream patch. > > > > A backport request was made previously but the patch didnot apply > > cleanly then: > > https://lore.kernel.org/all/CANT5p=oPGnCd4H5ppMbAiHsAKMor3LT_aQRqU7tKu=q6q1BGQg@xxxxxxxxxxxxxx/ > > > > xfstests with cifs done: before and after patching with this patch on 5.15.149. > > There is no change in test results before and after the patch. > > > > Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007 > > generic/010 generic/011 generic/013 generic/014 generic/023 generic/024 > > generic/028 generic/029 generic/030 generic/036 generic/069 generic/074 > > generic/075 generic/084 generic/091 generic/095 generic/098 generic/100 > > generic/109 generic/112 generic/113 generic/124 generic/127 generic/129 > > generic/130 generic/132 generic/133 generic/135 generic/141 generic/169 > > generic/198 generic/207 generic/208 generic/210 generic/211 generic/212 > > generic/221 generic/239 generic/241 generic/245 generic/246 generic/247 > > generic/248 generic/249 generic/257 generic/263 generic/285 generic/286 > > generic/308 generic/309 generic/310 generic/315 generic/323 generic/339 > > generic/340 generic/344 generic/345 generic/346 generic/354 generic/360 > > generic/393 generic/394 > > Not run: generic/010 generic/286 generic/315 > > Failures: generic/075 generic/112 generic/127 generic/285 > > Failed 4 of 68 tests > > > > SECTION -- smb3 > > ========================= > > Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007 > > generic/010 generic/011 generic/013 generic/014 generic/023 generic/024 > > generic/028 generic/029 generic/030 generic/036 generic/069 generic/074 > > generic/075 generic/084 generic/091 generic/095 generic/098 generic/100 > > generic/109 generic/112 generic/113 generic/124 generic/127 generic/129 > > generic/130 generic/132 generic/133 generic/135 generic/141 generic/169 > > generic/198 generic/207 generic/208 generic/210 generic/211 generic/212 > > generic/221 generic/239 generic/241 generic/245 generic/246 generic/247 > > generic/248 generic/249 generic/257 generic/263 generic/285 generic/286 > > generic/308 generic/309 generic/310 generic/315 generic/323 generic/339 > > generic/340 generic/344 generic/345 generic/346 generic/354 generic/360 > > generic/393 generic/394 > > Not run: generic/010 generic/014 generic/129 generic/130 generic/239 > > Failures: generic/075 generic/091 generic/112 generic/127 generic/263 generic/285 generic/286 > > Failed 7 of 68 tests > > > > SECTION -- smb21 > > ========================= > > Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007 > > generic/010 generic/011 generic/013 generic/014 generic/023 generic/024 > > generic/028 generic/029 generic/030 generic/036 generic/069 generic/074 > > generic/075 generic/084 generic/091 generic/095 generic/098 generic/100 > > generic/109 generic/112 generic/113 generic/124 generic/127 generic/129 > > generic/130 generic/132 generic/133 generic/135 generic/141 generic/169 > > generic/198 generic/207 generic/208 generic/210 generic/211 generic/212 > > generic/221 generic/239 generic/241 generic/245 generic/246 generic/247 > > generic/248 generic/249 generic/257 generic/263 generic/285 generic/286 > > generic/308 generic/309 generic/310 generic/315 generic/323 generic/339 > > generic/340 generic/344 generic/345 generic/346 generic/354 generic/360 > > generic/393 generic/394 > > Not run: generic/010 generic/014 generic/129 generic/130 generic/239 generic/286 generic/315 > > Failures: generic/075 generic/112 generic/127 generic/285 > > Failed 4 of 68 tests > > > > SECTION -- smb2 > > ========================= > > Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007 > > generic/010 generic/011 generic/013 generic/014 generic/023 generic/024 > > generic/028 generic/029 generic/030 generic/036 generic/069 generic/074 > > generic/075 generic/084 generic/091 generic/095 generic/098 generic/100 > > generic/109 generic/112 generic/113 generic/124 generic/127 generic/129 > > generic/130 generic/132 generic/133 generic/135 generic/141 generic/169 > > generic/198 generic/207 generic/208 generic/210 generic/211 generic/212 > > generic/221 generic/239 generic/241 generic/245 generic/246 generic/247 > > generic/248 generic/249 generic/257 generic/263 generic/285 generic/286 > > generic/308 generic/309 generic/310 generic/315 generic/323 generic/339 > > generic/340 generic/344 generic/345 generic/346 generic/354 generic/360 > > generic/393 generic/394 > > Not run: generic/010 generic/286 generic/315 > > Failures: generic/075 generic/112 generic/127 generic/285 > > Failed 4 of 68 tests > > --- > > fs/cifs/connect.c | 19 +++++++++++++++---- > > 1 file changed, 15 insertions(+), 4 deletions(-) > > > > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c > > index a521c705b0d7..a3e4811b7871 100644 > > --- a/fs/cifs/connect.c > > +++ b/fs/cifs/connect.c > > @@ -59,7 +59,7 @@ extern bool disable_legacy_dialects; > > #define TLINK_IDLE_EXPIRE (600 * HZ) > > > > /* Drop the connection to not overload the server */ > > -#define NUM_STATUS_IO_TIMEOUT 5 > > +#define MAX_STATUS_IO_TIMEOUT 5 > > > > struct mount_ctx { > > struct cifs_sb_info *cifs_sb; > > @@ -965,6 +965,7 @@ cifs_demultiplex_thread(void *p) > > struct mid_q_entry *mids[MAX_COMPOUND]; > > char *bufs[MAX_COMPOUND]; > > unsigned int noreclaim_flag, num_io_timeout = 0; > > + bool pending_reconnect = false; > > > > noreclaim_flag = memalloc_noreclaim_save(); > > cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current)); > > @@ -1004,6 +1005,8 @@ cifs_demultiplex_thread(void *p) > > cifs_dbg(FYI, "RFC1002 header 0x%x\n", pdu_length); > > if (!is_smb_response(server, buf[0])) > > continue; > > + > > + pending_reconnect = false; > > next_pdu: > > server->pdu_size = pdu_length; > > > > @@ -1063,10 +1066,13 @@ cifs_demultiplex_thread(void *p) > > if (server->ops->is_status_io_timeout && > > server->ops->is_status_io_timeout(buf)) { > > num_io_timeout++; > > - if (num_io_timeout > NUM_STATUS_IO_TIMEOUT) { > > - cifs_reconnect(server); > > + if (num_io_timeout > MAX_STATUS_IO_TIMEOUT) { > > + cifs_server_dbg(VFS, > > + "Number of request timeouts exceeded %d. Reconnecting", > > + MAX_STATUS_IO_TIMEOUT); > > + > > + pending_reconnect = true; > > num_io_timeout = 0; > > - continue; > > } > > } > > > > @@ -1113,6 +1119,11 @@ cifs_demultiplex_thread(void *p) > > buf = server->smallbuf; > > goto next_pdu; > > } > > + > > + /* do this reconnect at the very end after processing all MIDs */ > > + if (pending_reconnect) > > + cifs_reconnect(server); > > + > > } /* end while !EXITING */ > > > > /* buffer usually freed in free_mid - need to free it here on exit */ > > -- > > 2.43.0 > > > > These changes look good to me. > Reviewed-by: Shyam Prasad N <sprasad@xxxxxxxxxxxxx> Thanks, now queued up. greg k-h