Re: [PATCH 5.15.y] cifs: fix mid leak during reconnection after timeout threshold

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Feb 26, 2024 at 4:00 PM Harshit Mogalapalli
<harshit.m.mogalapalli@xxxxxxxxxx> wrote:
>
> From: Shyam Prasad N <nspmangalore@xxxxxxxxx>
>
> commit 69cba9d3c1284e0838ae408830a02c4a063104bc upstream.
>
> When the number of responses with status of STATUS_IO_TIMEOUT
> exceeds a specified threshold (NUM_STATUS_IO_TIMEOUT), we reconnect
> the connection. But we do not return the mid, or the credits
> returned for the mid, or reduce the number of in-flight requests.
>
> This bug could result in the server->in_flight count to go bad,
> and also cause a leak in the mids.
>
> This change moves the check to a few lines below where the
> response is decrypted, even of the response is read from the
> transform header. This way, the code for returning the mids
> can be reused.
>
> Also, the cifs_reconnect was reconnecting just the transport
> connection before. In case of multi-channel, this may not be
> what we want to do after several timeouts. Changed that to
> reconnect the session and the tree too.
>
> Also renamed NUM_STATUS_IO_TIMEOUT to a more appropriate name
> MAX_STATUS_IO_TIMEOUT.
>
> Fixes: 8e670f77c4a5 ("Handle STATUS_IO_TIMEOUT gracefully")
> Signed-off-by: Shyam Prasad N <sprasad@xxxxxxxxxxxxx>
> Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx>
> [Harshit: Backport to 5.15.y]
>  Conflicts:
>         fs/cifs/connect.c -- 5.15.y doesn't have commit 183eea2ee5ba
>         ("cifs: reconnect only the connection and not smb session where
>  possible") -- User cifs_reconnect(server) instead of
> cifs_reconnect(server, true)
>
> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@xxxxxxxxxx>
> ---
> Would be nice to get a review from author/maintainer of the upstream patch.
>
> A backport request was made previously but the patch didnot apply
> cleanly then:
> https://lore.kernel.org/all/CANT5p=oPGnCd4H5ppMbAiHsAKMor3LT_aQRqU7tKu=q6q1BGQg@xxxxxxxxxxxxxx/
>
> xfstests with cifs done: before and after patching with this patch on 5.15.149.
> There is no change in test results before and after the patch.
>
> Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007
> generic/010 generic/011 generic/013 generic/014 generic/023 generic/024
> generic/028 generic/029 generic/030 generic/036 generic/069 generic/074
> generic/075 generic/084 generic/091 generic/095 generic/098 generic/100
> generic/109 generic/112 generic/113 generic/124 generic/127 generic/129
> generic/130 generic/132 generic/133 generic/135 generic/141 generic/169
> generic/198 generic/207 generic/208 generic/210 generic/211 generic/212
> generic/221 generic/239 generic/241 generic/245 generic/246 generic/247
> generic/248 generic/249 generic/257 generic/263 generic/285 generic/286
> generic/308 generic/309 generic/310 generic/315 generic/323 generic/339
> generic/340 generic/344 generic/345 generic/346 generic/354 generic/360
> generic/393 generic/394
> Not run: generic/010 generic/286 generic/315
> Failures: generic/075 generic/112 generic/127 generic/285
> Failed 4 of 68 tests
>
> SECTION       -- smb3
> =========================
> Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007
> generic/010 generic/011 generic/013 generic/014 generic/023 generic/024
> generic/028 generic/029 generic/030 generic/036 generic/069 generic/074
> generic/075 generic/084 generic/091 generic/095 generic/098 generic/100
> generic/109 generic/112 generic/113 generic/124 generic/127 generic/129
> generic/130 generic/132 generic/133 generic/135 generic/141 generic/169
> generic/198 generic/207 generic/208 generic/210 generic/211 generic/212
> generic/221 generic/239 generic/241 generic/245 generic/246 generic/247
> generic/248 generic/249 generic/257 generic/263 generic/285 generic/286
> generic/308 generic/309 generic/310 generic/315 generic/323 generic/339
> generic/340 generic/344 generic/345 generic/346 generic/354 generic/360
> generic/393 generic/394
> Not run: generic/010 generic/014 generic/129 generic/130 generic/239
> Failures: generic/075 generic/091 generic/112 generic/127 generic/263 generic/285 generic/286
> Failed 7 of 68 tests
>
> SECTION       -- smb21
> =========================
> Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007
> generic/010 generic/011 generic/013 generic/014 generic/023 generic/024
> generic/028 generic/029 generic/030 generic/036 generic/069 generic/074
> generic/075 generic/084 generic/091 generic/095 generic/098 generic/100
> generic/109 generic/112 generic/113 generic/124 generic/127 generic/129
> generic/130 generic/132 generic/133 generic/135 generic/141 generic/169
> generic/198 generic/207 generic/208 generic/210 generic/211 generic/212
> generic/221 generic/239 generic/241 generic/245 generic/246 generic/247
> generic/248 generic/249 generic/257 generic/263 generic/285 generic/286
> generic/308 generic/309 generic/310 generic/315 generic/323 generic/339
> generic/340 generic/344 generic/345 generic/346 generic/354 generic/360
> generic/393 generic/394
> Not run: generic/010 generic/014 generic/129 generic/130 generic/239 generic/286 generic/315
> Failures: generic/075 generic/112 generic/127 generic/285
> Failed 4 of 68 tests
>
> SECTION       -- smb2
> =========================
> Ran: cifs/001 generic/001 generic/002 generic/005 generic/006 generic/007
> generic/010 generic/011 generic/013 generic/014 generic/023 generic/024
> generic/028 generic/029 generic/030 generic/036 generic/069 generic/074
> generic/075 generic/084 generic/091 generic/095 generic/098 generic/100
> generic/109 generic/112 generic/113 generic/124 generic/127 generic/129
> generic/130 generic/132 generic/133 generic/135 generic/141 generic/169
> generic/198 generic/207 generic/208 generic/210 generic/211 generic/212
> generic/221 generic/239 generic/241 generic/245 generic/246 generic/247
> generic/248 generic/249 generic/257 generic/263 generic/285 generic/286
> generic/308 generic/309 generic/310 generic/315 generic/323 generic/339
> generic/340 generic/344 generic/345 generic/346 generic/354 generic/360
> generic/393 generic/394
> Not run: generic/010 generic/286 generic/315
> Failures: generic/075 generic/112 generic/127 generic/285
> Failed 4 of 68 tests
> ---
>  fs/cifs/connect.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index a521c705b0d7..a3e4811b7871 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -59,7 +59,7 @@ extern bool disable_legacy_dialects;
>  #define TLINK_IDLE_EXPIRE      (600 * HZ)
>
>  /* Drop the connection to not overload the server */
> -#define NUM_STATUS_IO_TIMEOUT   5
> +#define MAX_STATUS_IO_TIMEOUT   5
>
>  struct mount_ctx {
>         struct cifs_sb_info *cifs_sb;
> @@ -965,6 +965,7 @@ cifs_demultiplex_thread(void *p)
>         struct mid_q_entry *mids[MAX_COMPOUND];
>         char *bufs[MAX_COMPOUND];
>         unsigned int noreclaim_flag, num_io_timeout = 0;
> +       bool pending_reconnect = false;
>
>         noreclaim_flag = memalloc_noreclaim_save();
>         cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
> @@ -1004,6 +1005,8 @@ cifs_demultiplex_thread(void *p)
>                 cifs_dbg(FYI, "RFC1002 header 0x%x\n", pdu_length);
>                 if (!is_smb_response(server, buf[0]))
>                         continue;
> +
> +               pending_reconnect = false;
>  next_pdu:
>                 server->pdu_size = pdu_length;
>
> @@ -1063,10 +1066,13 @@ cifs_demultiplex_thread(void *p)
>                 if (server->ops->is_status_io_timeout &&
>                     server->ops->is_status_io_timeout(buf)) {
>                         num_io_timeout++;
> -                       if (num_io_timeout > NUM_STATUS_IO_TIMEOUT) {
> -                               cifs_reconnect(server);
> +                       if (num_io_timeout > MAX_STATUS_IO_TIMEOUT) {
> +                               cifs_server_dbg(VFS,
> +                                               "Number of request timeouts exceeded %d. Reconnecting",
> +                                               MAX_STATUS_IO_TIMEOUT);
> +
> +                               pending_reconnect = true;
>                                 num_io_timeout = 0;
> -                               continue;
>                         }
>                 }
>
> @@ -1113,6 +1119,11 @@ cifs_demultiplex_thread(void *p)
>                         buf = server->smallbuf;
>                         goto next_pdu;
>                 }
> +
> +               /* do this reconnect at the very end after processing all MIDs */
> +               if (pending_reconnect)
> +                       cifs_reconnect(server);
> +
>         } /* end while !EXITING */
>
>         /* buffer usually freed in free_mid - need to free it here on exit */
> --
> 2.43.0
>

These changes look good to me.
Reviewed-by: Shyam Prasad N <sprasad@xxxxxxxxxxxxx>

-- 
Regards,
Shyam





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux