Re: list_del corruption while iterating retry_list in cifs_reconnect still seen on 5.4-rc3

On Fri, Oct 18, 2019 at 4:59 PM Pavel Shilovskiy <pshilov@xxxxxxxxxxxxx> wrote:
>
> Thanks for the good news that the patch is stable in your workload!
>
I ran the attached patch on top of 5.4-rc3 for over 5 hours today on the
reboot test. Before the patch, it would crash within a few minutes at most.

> The extra flag may not be necessary and we may rely on a MID state but we would need to handle two states actually: MID_RETRY_NEEDED and MID_SHUTDOWN - see clean_demultiplex_info() which is doing the same things with mid as cifs_reconnect(). Please add ref counting to both functions since they both can race with system call threads.
>
> I also think that we need to create as small a patch as possible to avoid hidden regressions. That's why I don't think we should change if () to WARN_ON() in the same patch; that change should be kept separate, without the stable tag.
>
IMO that 'if' statement is wrong, and should be removed unless it can
be defended.  Why are we _conditionally_ setting the state to
MID_RETRY_NEEDED in the same loop as we're putting mids on retry_list?
 What's the state machine supposed to be doing if it's ambiguous?
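
To make the ambiguity concrete, here is a rough userspace model, not
kernel code.  All of the toy_* and node_* names are made up for
illustration, there is no locking, and everything runs in one thread,
so it only shows the list bookkeeping and not the actual race.  An entry
that is not in the "submitted" state is still moved to the retry list,
and a later delete from the issuing path unlinks it from that list
unless the state was set unconditionally:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head (illustrative only). */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static bool node_empty(const struct node *head) { return head->next == head; }

static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical, reduced set of states; the real enum has more values. */
enum toy_state { TOY_SUBMITTED, TOY_RESPONSE_RECEIVED, TOY_RETRY_NEEDED };

struct toy_mid {
	struct node qhead;
	enum toy_state state;
};

/* Reconnect side: move the entry to the private retry list. */
static void toy_move_to_retry(struct toy_mid *mid, struct node *retry_list,
			      bool conditional)
{
	/* conditional == true models the old 'if (state == SUBMITTED)' check. */
	if (!conditional || mid->state == TOY_SUBMITTED)
		mid->state = TOY_RETRY_NEEDED;
	node_del_init(&mid->qhead);
	node_add_tail(&mid->qhead, retry_list);
}

/* Issuing-thread side: only unlink entries that reconnect has not claimed. */
static bool toy_delete(struct toy_mid *mid)
{
	if (mid->state == TOY_RETRY_NEEDED)
		return false;
	node_del_init(&mid->qhead);
	return true;
}

/* Returns 1 if the entry is still on the retry list after a late delete. */
static int toy_late_delete_leaves_retry_intact(bool conditional)
{
	struct node pending, retry;
	struct toy_mid mid = { .state = TOY_RESPONSE_RECEIVED };

	node_init(&pending);
	node_init(&retry);
	node_init(&mid.qhead);
	node_add_tail(&mid.qhead, &pending);

	toy_move_to_retry(&mid, &retry, conditional);
	toy_delete(&mid);
	return !node_empty(&retry);
}
```

With conditional set to true, the entry keeps its old state, so the
delete path cannot tell that it now sits on the retry list and unlinks
it.  In this model that is harmless; in the kernel the unlink would
happen behind the back of the thread walking retry_list.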

> Another general thought is that including extra logic into the MID state may complicate the code. Having a flag like MID_QUEUED would reflect the meaning more directly: if the mid is queued then de-queue it (i.e. remove it from the list), else skip this step. This may be changed later if you think this will complicate the small stable patch.
>

You all know better than me.  I'll take another look next week and
look forward to more discussion.
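
For reference in that discussion, my reading of the MID_QUEUED idea is
roughly the following.  This is again a single-threaded userspace
sketch under my own assumptions: the qmid_* and qnode_* names and the
QMID_QUEUED value are hypothetical, and in the kernel both helpers
would have to run under GlobalMid_Lock.  The flag records whether the
entry is linked on a list, so whichever path gets there second finds
the flag clear and skips the unlink:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head (illustrative only). */
struct qnode { struct qnode *prev, *next; };

static void qnode_init(struct qnode *n) { n->prev = n->next = n; }

static void qnode_add_tail(struct qnode *n, struct qnode *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void qnode_del_init(struct qnode *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	qnode_init(n);
}

#define QMID_QUEUED 0x1u	/* hypothetical flag: entry is linked on a list */

struct qmid {
	struct qnode qhead;
	unsigned int flags;
};

/* Link the entry and record that it is queued. */
static void qmid_enqueue(struct qmid *mid, struct qnode *head)
{
	qnode_add_tail(&mid->qhead, head);
	mid->flags |= QMID_QUEUED;
}

/* Unlink at most once; returns true only for the caller that did the unlink. */
static bool qmid_dequeue(struct qmid *mid)
{
	if (!(mid->flags & QMID_QUEUED))
		return false;
	qnode_del_init(&mid->qhead);
	mid->flags &= ~QMID_QUEUED;
	return true;
}

/* Two dequeue attempts on the same entry: counts how many actually unlinked. */
static int qmid_count_unlinks(void)
{
	struct qnode pending;
	struct qmid mid = { .flags = 0 };
	int unlinks = 0;

	qnode_init(&pending);
	qnode_init(&mid.qhead);
	qmid_enqueue(&mid, &pending);

	unlinks += qmid_dequeue(&mid);
	unlinks += qmid_dequeue(&mid);
	return unlinks;
}
```

The trade-off versus checking mid_state is one more bit to keep in sync
with the list, in exchange for not overloading the state machine.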

> --
> Best regards,
> Pavel Shilovsky
>
> -----Original Message-----
> From: David Wysochanski <dwysocha@xxxxxxxxxx>
> Sent: Friday, October 18, 2019 3:12 AM
> To: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>
> Cc: Pavel Shilovskiy <pshilov@xxxxxxxxxxxxx>; linux-cifs <linux-cifs@xxxxxxxxxxxxxxx>; Frank Sorenson <sorenson@xxxxxxxxxx>
> Subject: Re: list_del corruption while iterating retry_list in cifs_reconnect still seen on 5.4-rc3
>
> On Fri, Oct 18, 2019 at 5:27 AM Ronnie Sahlberg <lsahlber@xxxxxxxxxx> wrote:
> >
> >
> > ----- Original Message -----
> > > From: "David Wysochanski" <dwysocha@xxxxxxxxxx>
> > > To: "Ronnie Sahlberg" <lsahlber@xxxxxxxxxx>
> > > Cc: "Pavel Shilovskiy" <pshilov@xxxxxxxxxxxxx>, "linux-cifs" <linux-cifs@xxxxxxxxxxxxxxx>, "Frank Sorenson"
> > > <sorenson@xxxxxxxxxx>
> > > Sent: Friday, 18 October, 2019 6:16:45 PM
> > > Subject: Re: list_del corruption while iterating retry_list in
> > > cifs_reconnect still seen on 5.4-rc3
> > >
> > > On Thu, Oct 17, 2019 at 6:53 PM Ronnie Sahlberg <lsahlber@xxxxxxxxxx> wrote:
> > > >
> > > >
> > > >
> > > >
> >
> > Good comments.
> > New version of the patch, please test and see comments inline below
> >
> > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> > index bdea4b3e8005..8a78358693a5 100644
> > --- a/fs/cifs/connect.c
> > +++ b/fs/cifs/connect.c
> > @@ -564,8 +564,13 @@ cifs_reconnect(struct TCP_Server_Info *server)
> >         spin_lock(&GlobalMid_Lock);
> >         list_for_each_safe(tmp, tmp2, &server->pending_mid_q) {
> >                 mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
> > -               if (mid_entry->mid_state == MID_REQUEST_SUBMITTED)
> > -                       mid_entry->mid_state = MID_RETRY_NEEDED;
> > +               kref_get(&mid_entry->refcount);
> > +               WARN_ON(mid_entry->mid_state != MID_REQUEST_SUBMITTED);
> > +               /*
> > +                * Set MID_RETRY_NEEDED to prevent the demultiplex loop from
> > +                * removing us, or our neighbours, from the linked list.
> > +                */
> > +               mid_entry->mid_state = MID_RETRY_NEEDED;
> >                 list_move(&mid_entry->qhead, &retry_list);
> >         }
> >         spin_unlock(&GlobalMid_Lock);
> > @@ -575,7 +580,9 @@ cifs_reconnect(struct TCP_Server_Info *server)
> >         list_for_each_safe(tmp, tmp2, &retry_list) {
> >                 mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
> >                 list_del_init(&mid_entry->qhead);
> > +
> >                 mid_entry->callback(mid_entry);
> > +               cifs_mid_q_entry_release(mid_entry);
> >         }
> >
> >         if (cifs_rdma_enabled(server)) {
> > @@ -895,7 +902,7 @@ dequeue_mid(struct mid_q_entry *mid, bool malformed)
> >         if (mid->mid_flags & MID_DELETED)
> >                 printk_once(KERN_WARNING
> >                             "trying to dequeue a deleted mid\n");
> > -       else
> > +       else if (mid->mid_state != MID_RETRY_NEEDED)
>
> I'm just using an 'if' here not 'else if'.  Do you see any issue with that?
>
> Actually this section needed a little reorganizing due to the setting of the mid_state.  So I have this now for this hunk:
>
>         mid->when_received = jiffies;
>  #endif
>         spin_lock(&GlobalMid_Lock);
> -       if (!malformed)
> -               mid->mid_state = MID_RESPONSE_RECEIVED;
> -       else
> -               mid->mid_state = MID_RESPONSE_MALFORMED;
>         /*
>          * Trying to handle/dequeue a mid after the send_recv()
>          * function has finished processing it is a bug.
> @@ -895,8 +893,14 @@ static inline int reconn_setup_dfs_targets(struct cifs_sb_info *cifs_sb,
>         if (mid->mid_flags & MID_DELETED)
>                 printk_once(KERN_WARNING
>                             "trying to dequeue a deleted mid\n");
> -       else
> +       if (mid->mid_state != MID_RETRY_NEEDED)
>                 list_del_init(&mid->qhead);
> +
> +       if (!malformed)
> +               mid->mid_state = MID_RESPONSE_RECEIVED;
> +       else
> +               mid->mid_state = MID_RESPONSE_MALFORMED;
> +
>         spin_unlock(&GlobalMid_Lock);
>  }
>
>
>
> >                 list_del_init(&mid->qhead);
> >         spin_unlock(&GlobalMid_Lock);
> >  }
> > diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> > index 308ad0f495e1..17a430b58673 100644
> > --- a/fs/cifs/transport.c
> > +++ b/fs/cifs/transport.c
> > @@ -173,7 +173,8 @@ void
> >  cifs_delete_mid(struct mid_q_entry *mid)
> >  {
> >         spin_lock(&GlobalMid_Lock);
> > -       list_del_init(&mid->qhead);
> > +       if (mid->mid_state != MID_RETRY_NEEDED)
> > +               list_del_init(&mid->qhead);
> >         mid->mid_flags |= MID_DELETED;
> >         spin_unlock(&GlobalMid_Lock);
> >
> > @@ -872,7 +873,8 @@ cifs_sync_mid_result(struct mid_q_entry *mid, struct TCP_Server_Info *server)
> >                 rc = -EHOSTDOWN;
> >                 break;
> >         default:
> > -               list_del_init(&mid->qhead);
> > +               if (mid->mid_state != MID_RETRY_NEEDED)
> > +                       list_del_init(&mid->qhead);
> >                 cifs_server_dbg(VFS, "%s: invalid mid state mid=%llu state=%d\n",
> >                          __func__, mid->mid, mid->mid_state);
> >                 rc = -EIO;
> >
> >
> >
> >
> >
> >
> > > >
> > > > ----- Original Message -----
> > > > > From: "Pavel Shilovskiy" <pshilov@xxxxxxxxxxxxx>
> > > > > To: "Ronnie Sahlberg" <lsahlber@xxxxxxxxxx>, "David Wysochanski"
> > > > > <dwysocha@xxxxxxxxxx>
> > > > > Cc: "linux-cifs" <linux-cifs@xxxxxxxxxxxxxxx>, "Frank Sorenson"
> > > > > <sorenson@xxxxxxxxxx>
> > > > > Sent: Friday, 18 October, 2019 8:02:23 AM
> > > > > Subject: RE: list_del corruption while iterating retry_list in
> > > > > cifs_reconnect still seen on 5.4-rc3
> > > > >
> > > > > Ok, looking at cifs_delete_mid():
> > > > >
> > > > > > void
> > > > > > cifs_delete_mid(struct mid_q_entry *mid)
> > > > > > {
> > > > > >         spin_lock(&GlobalMid_Lock);
> > > > > >         list_del_init(&mid->qhead);
> > > > > >         mid->mid_flags |= MID_DELETED;
> > > > > >         spin_unlock(&GlobalMid_Lock);
> > > > > >
> > > > > >         DeleteMidQEntry(mid);
> > > > > > }
> > > > >
> > > > > So, regardless of us taking references on the mid itself or not,
> > > > > the mid might be removed from the list. I also don't think
> > > > > taking GlobalMid_Lock would help much because the next mid in
> > > > > the list might be deleted from the list by another process while
> > > > > cifs_reconnect is calling callback for the current mid.
> > > > >
> > >
> > > Yes the above is consistent with my tracing the crash after the first
> > > initial refcount patch was applied.
> > > After the simple refcount patch, when iterating the retry_list, it was
> > > processing an orphaned list with a single item over and over and
> > > eventually ran itself down to refcount == 0 and crashed like before.
> > >
> > >
> > > > > Instead, shouldn't we try marking the mid as being reconnected? Once we
> > > > > took a reference, let's mark mid->mid_flags with a new flag MID_RECONNECT
> > > > > under the GlobalMid_Lock. Then modify cifs_delete_mid() to check for this
> > > > > flag and do not remove the mid from the list if the flag exists.
> > > >
> > > > That could work. But then we should also use that flag to suppress the
> > > > other places where we do a list_del*, so something like this ?
> > > >
> > > > diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
> > > > index 50dfd9049370..b324fff33e53 100644
> > > > --- a/fs/cifs/cifsglob.h
> > > > +++ b/fs/cifs/cifsglob.h
> > > > @@ -1702,6 +1702,7 @@ static inline bool is_retryable_error(int error)
> > > >  /* Flags */
> > > >  #define   MID_WAIT_CANCELLED    1 /* Cancelled while waiting for response */
> > > >  #define   MID_DELETED            2 /* Mid has been dequeued/deleted */
> > > > +#define   MID_RECONNECT          4 /* Mid is being used during reconnect */
> > > >
> > > Do we need this extra flag?  Can just use  mid_state ==
> > > MID_RETRY_NEEDED in the necessary places?
> >
> > That is a good point.
> > It saves us a redundant flag.
> >
> > >
> > >
> > > >  /* Types of response buffer returned from SendReceive2 */
> > > >  #define   CIFS_NO_BUFFER        0    /* Response buffer not returned */
> > > > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> > > > index bdea4b3e8005..b142bd2a3ef5 100644
> > > > --- a/fs/cifs/connect.c
> > > > +++ b/fs/cifs/connect.c
> > > > @@ -564,6 +564,8 @@ cifs_reconnect(struct TCP_Server_Info *server)
> > > >         spin_lock(&GlobalMid_Lock);
> > > >         list_for_each_safe(tmp, tmp2, &server->pending_mid_q) {
> > > >                 mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
> > > > +               kref_get(&mid_entry->refcount);
> > > > +               mid_entry->mid_flags |= MID_RECONNECT;
> > > >                 if (mid_entry->mid_state == MID_REQUEST_SUBMITTED)
> > > >                         mid_entry->mid_state = MID_RETRY_NEEDED;
> > >
> > > What happens if the state is wrong going in there, and it is not set
> > > to MID_RETRY_NEEDED, but yet we queue up the retry_list and run it
> > > below?
> > > Should the above 'if' check for MID_REQUEST_SUBMITTED be a WARN_ON
> > > followed by unconditionally setting the state?
> > >
> > > WARN_ON(mid_entry->mid_state != MID_REQUEST_SUBMITTED);
> > > /* Unconditionally set MID_RETRY_NEEDED */
> > > mid_entry->mid_state = MID_RETRY_NEEDED;
> >
> > Yepp.
> >
> > >
> > >
> > > >                 list_move(&mid_entry->qhead, &retry_list);
> > > > @@ -575,7 +577,9 @@ cifs_reconnect(struct TCP_Server_Info *server)
> > > >         list_for_each_safe(tmp, tmp2, &retry_list) {
> > > >                 mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
> > > >                 list_del_init(&mid_entry->qhead);
> > > > +
> > > >                 mid_entry->callback(mid_entry);
> > > > +               cifs_mid_q_entry_release(mid_entry);
> > > >         }
> > > >
> > > >         if (cifs_rdma_enabled(server)) {
> > > > @@ -895,7 +899,7 @@ dequeue_mid(struct mid_q_entry *mid, bool malformed)
> > > >         if (mid->mid_flags & MID_DELETED)
> > > >                 printk_once(KERN_WARNING
> > > >                             "trying to dequeue a deleted mid\n");
> > > > -       else
> > > > +       else if (!(mid->mid_flags & MID_RECONNECT))
> > >
> > > Instead of the above,
> > >
> > >  -       else
> > > +          else if (mid_entry->mid_state == MID_RETRY_NEEDED)
> >
> > Yes, but mid_state != MID_RETRY_NEEDED
> >
>
> Yeah good catch on that - somehow I reversed the logic, and when I
> tested the former it blew up spectacularly almost instantaneously!
> Doh!
>
> So far the latest patch has been running for about 25 minutes, which
> is I think the longest this test has survived.
> I need a bit more runtime to be sure it's good, but if it keeps going
> I'll plan to create a patch header and submit to list by end of today.
> Thanks Ronnie and Pavel for the help tracking this down.
>
>
>
>
>
>
From 32def41aac71b227dc11a5988754cbda4ba9ad8a Mon Sep 17 00:00:00 2001
From: Dave Wysochanski <dwysocha@xxxxxxxxxx>
Date: Fri, 18 Oct 2019 04:28:56 -0400
Subject: [PATCH] cifs: Fix list_del corruption of retry_list in cifs_reconnect

There's a race between the demultiplexer thread and the request
issuing thread similar to the race described in
commit 696e420bb2a6 ("cifs: Fix use after free of a mid_q_entry")
where both threads may obtain and attempt to call list_del_init
on the same mid and a list_del corruption similar to the
following will result:

[  430.454897] list_del corruption. prev->next should be ffff98d3a8f316c0, but was 2e885cb266355469
[  430.464668] ------------[ cut here ]------------
[  430.466569] kernel BUG at lib/list_debug.c:51!
[  430.468476] invalid opcode: 0000 [#1] SMP PTI
[  430.470286] CPU: 0 PID: 13267 Comm: cifsd Kdump: loaded Not tainted 5.4.0-rc3+ #19
[  430.473472] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  430.475872] RIP: 0010:__list_del_entry_valid.cold+0x31/0x55
[  430.478129] Code: 5e 15 8e e8 54 a3 c5 ff 0f 0b 48 c7 c7 78 5f 15 8e e8 46 a3 c5 ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 38 5f 15 8e e8 32 a3 c5 ff <0f> 0b 48 89 fe 4c 89 c2 48 c7 c7 00 5f 15 8e e8 1e a3 c5 ff 0f 0b
[  430.485563] RSP: 0018:ffffb4db0042fd38 EFLAGS: 00010246
[  430.487665] RAX: 0000000000000054 RBX: ffff98d3aabb8800 RCX: 0000000000000000
[  430.490513] RDX: 0000000000000000 RSI: ffff98d3b7a17908 RDI: ffff98d3b7a17908
[  430.493383] RBP: ffff98d3a8f316c0 R08: ffff98d3b7a17908 R09: 0000000000000285
[  430.496258] R10: ffffb4db0042fbf0 R11: ffffb4db0042fbf5 R12: ffff98d3aabb89c0
[  430.499113] R13: ffffb4db0042fd48 R14: 2e885cb266355469 R15: ffff98d3b24c4480
[  430.501981] FS:  0000000000000000(0000) GS:ffff98d3b7a00000(0000) knlGS:0000000000000000
[  430.505232] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  430.507546] CR2: 00007f08cd17b9c0 CR3: 000000023484a000 CR4: 00000000000406f0
[  430.510426] Call Trace:
[  430.511500]  cifs_reconnect+0x25e/0x610 [cifs]
[  430.513350]  cifs_readv_from_socket+0x220/0x250 [cifs]
[  430.515464]  cifs_read_from_socket+0x4a/0x70 [cifs]
[  430.517452]  ? try_to_wake_up+0x212/0x650
[  430.519122]  ? cifs_small_buf_get+0x16/0x30 [cifs]
[  430.521086]  ? allocate_buffers+0x66/0x120 [cifs]
[  430.523019]  cifs_demultiplex_thread+0xdc/0xc30 [cifs]
[  430.525116]  kthread+0xfb/0x130
[  430.526421]  ? cifs_handle_standard+0x190/0x190 [cifs]
[  430.528514]  ? kthread_park+0x90/0x90
[  430.530019]  ret_from_fork+0x35/0x40

To fix the above, inside cifs_reconnect unconditionally set the
state to MID_RETRY_NEEDED, and then take a reference before we
move any mid_q_entry on server->pending_mid_q to the temporary
retry_list.  Then while processing retry_list drop the reference
after the mid_q_entry callback has been completed.  In the code
paths for the request issuing thread, skip the call to list_del_init
if we notice mid->mid_state == MID_RETRY_NEEDED, avoiding the
race and the duplicate call to list_del_init.

Signed-off-by: Dave Wysochanski <dwysocha@xxxxxxxxxx>
---
 fs/cifs/connect.c   | 18 +++++++++++-------
 fs/cifs/transport.c |  6 ++++--
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index a64dfa95a925..c8b8d4efe5a4 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -564,8 +564,9 @@ static inline int reconn_setup_dfs_targets(struct cifs_sb_info *cifs_sb,
 	spin_lock(&GlobalMid_Lock);
 	list_for_each_safe(tmp, tmp2, &server->pending_mid_q) {
 		mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
-		if (mid_entry->mid_state == MID_REQUEST_SUBMITTED)
-			mid_entry->mid_state = MID_RETRY_NEEDED;
+		kref_get(&mid_entry->refcount);
+		WARN_ON(mid_entry->mid_state != MID_REQUEST_SUBMITTED);
+		mid_entry->mid_state = MID_RETRY_NEEDED;
 		list_move(&mid_entry->qhead, &retry_list);
 	}
 	spin_unlock(&GlobalMid_Lock);
@@ -576,6 +577,7 @@ static inline int reconn_setup_dfs_targets(struct cifs_sb_info *cifs_sb,
 		mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
 		list_del_init(&mid_entry->qhead);
 		mid_entry->callback(mid_entry);
+		cifs_mid_q_entry_release(mid_entry);
 	}
 
 	if (cifs_rdma_enabled(server)) {
@@ -884,10 +886,6 @@ static inline int reconn_setup_dfs_targets(struct cifs_sb_info *cifs_sb,
 	mid->when_received = jiffies;
 #endif
 	spin_lock(&GlobalMid_Lock);
-	if (!malformed)
-		mid->mid_state = MID_RESPONSE_RECEIVED;
-	else
-		mid->mid_state = MID_RESPONSE_MALFORMED;
 	/*
 	 * Trying to handle/dequeue a mid after the send_recv()
 	 * function has finished processing it is a bug.
@@ -895,8 +893,14 @@ static inline int reconn_setup_dfs_targets(struct cifs_sb_info *cifs_sb,
 	if (mid->mid_flags & MID_DELETED)
 		printk_once(KERN_WARNING
 			    "trying to dequeue a deleted mid\n");
-	else
+	if (mid->mid_state != MID_RETRY_NEEDED)
 		list_del_init(&mid->qhead);
+
+	if (!malformed)
+		mid->mid_state = MID_RESPONSE_RECEIVED;
+	else
+		mid->mid_state = MID_RESPONSE_MALFORMED;
+
 	spin_unlock(&GlobalMid_Lock);
 }
 
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index 308ad0f495e1..17a430b58673 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -173,7 +173,8 @@ void cifs_mid_q_entry_release(struct mid_q_entry *midEntry)
 cifs_delete_mid(struct mid_q_entry *mid)
 {
 	spin_lock(&GlobalMid_Lock);
-	list_del_init(&mid->qhead);
+	if (mid->mid_state != MID_RETRY_NEEDED)
+		list_del_init(&mid->qhead);
 	mid->mid_flags |= MID_DELETED;
 	spin_unlock(&GlobalMid_Lock);
 
@@ -872,7 +873,8 @@ struct mid_q_entry *
 		rc = -EHOSTDOWN;
 		break;
 	default:
-		list_del_init(&mid->qhead);
+		if (mid->mid_state != MID_RETRY_NEEDED)
+			list_del_init(&mid->qhead);
 		cifs_server_dbg(VFS, "%s: invalid mid state mid=%llu state=%d\n",
 			 __func__, mid->mid, mid->mid_state);
 		rc = -EIO;
-- 
1.8.3.1

