Hello Joe,

I think the commit you mention is related to handling read errors, in
which case freeze_array is called, and it may hang due to incorrect
accounting of IO requests. Also, this commit is only relevant since
kernel 4.3. For example, in kernel 3.18 there is no "bio_end_io_list"
at all.

Looking more at this issue, I don't think this is related to the new
freeze_array code using array_frozen since
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/md/raid1.c?id=b364e3d048e49b1d177eb7ee7853e77aa0560464
because the same plugging infrastructure already existed, for example,
in kernel 3.8, but we did not observe similar deadlocks. I will have to
dig more to understand how this deadlock is avoided.

I am more worried now about the freeze_array deadlock I reported in
http://www.spinics.net/lists/raid/msg52678.html
This is a real deadlock that we see now.

Thanks,
Alex.

On Thu, Jun 16, 2016 at 6:38 AM, Lawrence, Joe
<Joe.Lawrence@xxxxxxxxxxx> wrote:
> Hi Alexander,
>
> Any chance this was handled by commit "raid1: include bio_end_io_list in
> nr_queued to prevent freeze_array hang" [1]
>
> [1]
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/md/raid1.c?id=ccfc7bf1f09d6190ef86693ddc761d5fe3fa47cb
> ________________________________
> From: linux-raid-owner@xxxxxxxxxxxxxxx <linux-raid-owner@xxxxxxxxxxxxxxx> on
> behalf of Alexander Lyakas <alex.bolshoy@xxxxxxxxx>
> Sent: Monday, June 13, 2016 7:02:38 AM
> To: Neil Brown; Jes Sorensen; linux-raid
> Subject: RAID1: deadlock between freeze_array and blk plug?
>
> Hello Neil, Jes,
>
> I wonder if the following deadlock is possible:
>
> - Caller calls blk_start_plug and wants to submit two WRITE bios
>
> - First bio successfully calls wait_barrier() and is appended to
> plug->pending list
>
> - Now somebody does freeze_array()
>
> - freeze_array() unconditionally sets:
> conf->array_frozen = 1;
> and starts waiting for conf->nr_pending to go down
>
> - Second WRITE bio calls wait_barrier, but it will wait for
> "!conf->array_frozen" until it can proceed
>
> - Now we have a deadlock: first bio will not be submitted because it
> sits on the plug list of the caller, and caller is stuck in
> wait_barrier, so it cannot do blk_finish_plug.
>
> I am about to try to reproduce it on kernel 3.18, but looking at the
> latest Linus tree, I don't see something preventing this from
> happening either. Am I missing something?
>
> Thanks,
> Alex.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
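
The sequence described in the quoted message can be sketched as a small
single-threaded Python model. This is only an illustration of the
suspected ordering, not kernel code: the classes Conf and Caller and the
helpers submit_write, is_deadlocked and run_scenario are hypothetical
names, and freeze_array is simplified to wait for nr_pending to reach
zero. The model deliberately ignores any other kernel mechanism that
might flush the plug list early, so it says nothing about whether the
hang can actually occur on a real kernel.

```python
from dataclasses import dataclass, field


@dataclass
class Conf:
    """Toy stand-in for the two raid1 conf fields named in the message."""
    nr_pending: int = 0
    array_frozen: bool = False


@dataclass
class Caller:
    """A task that holds a plug and submits bios one at a time."""
    plug_pending: list = field(default_factory=list)
    blocked_in_wait_barrier: bool = False


def wait_barrier(conf, caller):
    """Return True if the bio may proceed, False if the caller would block."""
    if conf.array_frozen:
        caller.blocked_in_wait_barrier = True
        return False
    conf.nr_pending += 1
    return True


def submit_write(conf, caller, bio):
    """Pass the barrier, then park the bio on the caller's plug list."""
    if wait_barrier(conf, caller):
        caller.plug_pending.append(bio)
        return True
    return False


def freeze_array(conf):
    """Set array_frozen and report whether the freeze could complete now.

    Assumption: the real wait condition is simplified to nr_pending == 0.
    """
    conf.array_frozen = True
    return conf.nr_pending == 0


def is_deadlocked(conf, caller):
    """The freezer waits on plugged bios while the plug owner waits on the freeze."""
    freezer_waiting = conf.array_frozen and conf.nr_pending > 0
    pending_only_on_plug = len(caller.plug_pending) == conf.nr_pending
    return (freezer_waiting and pending_only_on_plug
            and caller.blocked_in_wait_barrier)


def run_scenario():
    """Replay the steps from the message in order."""
    conf, caller = Conf(), Caller()
    first_ok = submit_write(conf, caller, "bio1")   # first WRITE bio is plugged
    freeze_done = freeze_array(conf)                # somebody freezes the array
    second_ok = submit_write(conf, caller, "bio2")  # second WRITE bio blocks
    return first_ok, freeze_done, second_ok, is_deadlocked(conf, caller)
```

In this model, run_scenario() reports that the first bio passed the
barrier, the freeze could not complete, the second bio blocked, and the
two waiters depend on each other. Without the freeze_array call both
bios pass the barrier and nothing is stuck.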