Re: BUG - raid 1 deadlock on handle_read_error / wait_barrier

Hello Neil,
we are hitting an issue that looks very similar; we are on kernel 3.8.2.
The array is a 2-device raid1 which experienced a drive failure; the
drive was then removed and re-added to the array (although the
rebuild never started). Relevant parts of the kernel log:

May 16 11:12:14  kernel: [46850.090499] md/raid1:md2: Disk failure on
dm-6, disabling device.
May 16 11:12:14  kernel: [46850.090499] md/raid1:md2: Operation
continuing on 1 devices.
May 16 11:12:14  kernel: [46850.090511] md: super_written gets
error=-5, uptodate=0
May 16 11:12:14  kernel: [46850.090676] md/raid1:md2: dm-6:
rescheduling sector 18040736
May 16 11:12:14  kernel: [46850.090764] md/raid1:md2: dm-6:
rescheduling sector 20367040
May 16 11:12:14  kernel: [46850.090826] md/raid1:md2: dm-6:
rescheduling sector 17613504
May 16 11:12:14  kernel: [46850.090883] md/raid1:md2: dm-6:
rescheduling sector 18042720
May 16 11:12:15  kernel: [46850.229970] md/raid1:md2: redirecting
sector 18040736 to other mirror: dm-13
May 16 11:12:15  kernel: [46850.257687] md/raid1:md2: redirecting
sector 20367040 to other mirror: dm-13
May 16 11:12:15  kernel: [46850.268731] md/raid1:md2: redirecting
sector 17613504 to other mirror: dm-13
May 16 11:12:15  kernel: [46850.398242] md/raid1:md2: redirecting
sector 18042720 to other mirror: dm-13
May 16 11:12:23  kernel: [46858.448465] md: unbind<dm-6>
May 16 11:12:23  kernel: [46858.456081] md: export_rdev(dm-6)
May 16 11:23:19  kernel: [47515.062547] md: bind<dm-6>

May 16 11:24:28  kernel: [47583.920126] INFO: task md2_raid1:15749
blocked for more than 60 seconds.
May 16 11:24:28  kernel: [47583.921829] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 16 11:24:28  kernel: [47583.923361] md2_raid1       D
0000000000000001     0 15749      2 0x00000000
May 16 11:24:28  kernel: [47583.923367]  ffff880097c23c28
0000000000000046 ffff880000000002 00000000982c43b8
May 16 11:24:28  kernel: [47583.923372]  ffff880097c23fd8
ffff880097c23fd8 ffff880097c23fd8 0000000000013f40
May 16 11:24:28  kernel: [47583.923376]  ffff880119b11740
ffff8800a5489740 ffff880097c23c38 ffff88011609d3c0
May 16 11:24:28  kernel: [47583.923381] Call Trace:
May 16 11:24:28  kernel: [47583.923395]  [<ffffffff816eb399>] schedule+0x29/0x70
May 16 11:24:28  kernel: [47583.923402]  [<ffffffffa0516736>]
raise_barrier+0x106/0x160 [raid1]
May 16 11:24:28  kernel: [47583.923410]  [<ffffffff8107fcc0>] ?
add_wait_queue+0x60/0x60
May 16 11:24:28  kernel: [47583.923415]  [<ffffffffa0516af7>]
raid1_add_disk+0x197/0x200 [raid1]
May 16 11:24:28  kernel: [47583.923421]  [<ffffffff81567fa4>]
remove_and_add_spares+0x104/0x220
May 16 11:24:28  kernel: [47583.923426]  [<ffffffff8156a02d>]
md_check_recovery.part.49+0x40d/0x530
May 16 11:24:28  kernel: [47583.923429]  [<ffffffff8156a165>]
md_check_recovery+0x15/0x20
May 16 11:24:28  kernel: [47583.923433]  [<ffffffffa0517e42>]
raid1d+0x22/0x180 [raid1]
May 16 11:24:28  kernel: [47583.923439]  [<ffffffff81045cd9>] ?
default_spin_lock_flags+0x9/0x10
May 16 11:24:28  kernel: [47583.923443]  [<ffffffff81045cd9>] ?
default_spin_lock_flags+0x9/0x10
May 16 11:24:28  kernel: [47583.923449]  [<ffffffff815624ed>]
md_thread+0x10d/0x140
May 16 11:24:28  kernel: [47583.923453]  [<ffffffff8107fcc0>] ?
add_wait_queue+0x60/0x60
May 16 11:24:28  kernel: [47583.923457]  [<ffffffff815623e0>] ?
md_rdev_init+0x140/0x140
May 16 11:24:28  kernel: [47583.923461]  [<ffffffff8107f0d0>] kthread+0xc0/0xd0
May 16 11:24:28  kernel: [47583.923465]  [<ffffffff8107f010>] ?
flush_kthread_worker+0xb0/0xb0
May 16 11:24:28  kernel: [47583.923472]  [<ffffffff816f506c>]
ret_from_fork+0x7c/0xb0
May 16 11:24:28  kernel: [47583.923476]  [<ffffffff8107f010>] ?
flush_kthread_worker+0xb0/0xb0

dm-13 is the good drive; dm-6 is the one that failed.

At this point, several threads calling submit_bio are all stuck
like this:

cat /proc/16218/stack
[<ffffffffa0516d6e>] wait_barrier+0xbe/0x160 [raid1]
[<ffffffffa0518627>] make_request+0x87/0xa90 [raid1]
[<ffffffff81561ed0>] md_make_request+0xd0/0x200
[<ffffffff8132bcaa>] generic_make_request+0xca/0x100
[<ffffffff8132bd5b>] submit_bio+0x7b/0x160
...

And the md raid1 thread is stuck like this:

cat /proc/15749/stack
[<ffffffffa0516736>] raise_barrier+0x106/0x160 [raid1]
[<ffffffffa0516af7>] raid1_add_disk+0x197/0x200 [raid1]
[<ffffffff81567fa4>] remove_and_add_spares+0x104/0x220
[<ffffffff8156a02d>] md_check_recovery.part.49+0x40d/0x530
[<ffffffff8156a165>] md_check_recovery+0x15/0x20
[<ffffffffa0517e42>] raid1d+0x22/0x180 [raid1]
[<ffffffff815624ed>] md_thread+0x10d/0x140
[<ffffffff8107f0d0>] kthread+0xc0/0xd0
[<ffffffff816f506c>] ret_from_fork+0x7c/0xb0
[<ffffffffffffffff>] 0xffffffffffffffff

We also have two user-space threads stuck:

One is trying to read /sys/block/md2/md/array_state; its kernel stack is:
# cat /proc/2251/stack
[<ffffffff81564602>] md_attr_show+0x72/0xf0
[<ffffffff8120f116>] fill_read_buffer.isra.8+0x66/0xf0
[<ffffffff8120f244>] sysfs_read_file+0xa4/0xc0
[<ffffffff8119b0d0>] vfs_read+0xb0/0x180
[<ffffffff8119b1f2>] sys_read+0x52/0xa0
[<ffffffff816f511d>] system_call_fastpath+0x1a/0x1f
[<ffffffffffffffff>] 0xffffffffffffffff

The other is trying to read from /proc/mdstat:
[<ffffffff81563d2b>] md_seq_show+0x4b/0x540
[<ffffffff811bdd1b>] seq_read+0x16b/0x400
[<ffffffff811ff572>] proc_reg_read+0x82/0xc0
[<ffffffff8119b0d0>] vfs_read+0xb0/0x180
[<ffffffff8119b1f2>] sys_read+0x52/0xa0
[<ffffffff816f511d>] system_call_fastpath+0x1a/0x1f
[<ffffffffffffffff>] 0xffffffffffffffff

mdadm --detail also gets stuck if attempted, with a stack like this:
cat /proc/2864/stack
[<ffffffff81564602>] md_attr_show+0x72/0xf0
[<ffffffff8120f116>] fill_read_buffer.isra.8+0x66/0xf0
[<ffffffff8120f244>] sysfs_read_file+0xa4/0xc0
[<ffffffff8119b0d0>] vfs_read+0xb0/0x180
[<ffffffff8119b1f2>] sys_read+0x52/0xa0
[<ffffffff816f511d>] system_call_fastpath+0x1a/0x1f
[<ffffffffffffffff>] 0xffffffffffffffff

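To summarize my understanding of the hang, here is a rough userspace
model I wrote (NOT the actual drivers/md/raid1.c code; the field and
function names mirror raid1.c but the logic is deliberately
simplified, and complete_pending() is a hypothetical stand-in for the
completion path). It shows why a missed wake-up on conf->wait_barrier
would leave both sides asleep: raise_barrier() in the md2_raid1 thread
waits for nr_pending to drain to zero, while every make_request()
caller is parked in wait_barrier() waiting for the barrier to drop.

/*
 * Simplified model of the two barrier primitives seen in the stacks
 * above.  Compile with: cc -c raid1_barrier_model.c
 */
#include <pthread.h>

struct r1conf_model {
    pthread_mutex_t resync_lock;   /* models conf->resync_lock */
    pthread_cond_t  wait_barrier;  /* models the conf->wait_barrier waitqueue */
    int barrier;                   /* a barrier (resync/recovery/add_disk) is up */
    int nr_pending;                /* normal I/O currently in flight */
};

/* raid1_add_disk()/resync path: block new I/O, then wait for old I/O. */
static void raise_barrier(struct r1conf_model *conf)
{
    pthread_mutex_lock(&conf->resync_lock);
    conf->barrier++;                  /* new requests will now block */
    while (conf->nr_pending > 0)      /* wait for in-flight I/O to drain */
        pthread_cond_wait(&conf->wait_barrier, &conf->resync_lock);
    pthread_mutex_unlock(&conf->resync_lock);
}

/* make_request() path: every normal I/O passes through here first. */
static void wait_barrier(struct r1conf_model *conf)
{
    pthread_mutex_lock(&conf->resync_lock);
    while (conf->barrier)             /* sleep while the barrier is up */
        pthread_cond_wait(&conf->wait_barrier, &conf->resync_lock);
    conf->nr_pending++;
    pthread_mutex_unlock(&conf->resync_lock);
}

/* Completion path: drop nr_pending and wake the raise_barrier() waiter.
 * If the requests counted in nr_pending never get submitted/completed
 * (e.g. they sit on a plug list) and no wake-up is issued,
 * raise_barrier() sleeps forever and every wait_barrier() caller behind
 * it does too -- which looks like the state in the stacks above. */
static void complete_pending(struct r1conf_model *conf)
{
    pthread_mutex_lock(&conf->resync_lock);
    conf->nr_pending--;
    pthread_cond_broadcast(&conf->wait_barrier);
    pthread_mutex_unlock(&conf->resync_lock);
}
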
Might your patch https://patchwork.kernel.org/patch/2260051/ fix this
issue? Is that patch alone applicable to kernel 3.8.2?
Could you please comment on this?

Thanks for your help,
Alex.




On Tue, Feb 26, 2013 at 4:09 PM, Joe Lawrence <Joe.Lawrence@xxxxxxxxxxx> wrote:
>
> Same here:  after ~15 hrs, ~300 surprise device removals and fio stress,
> no hung tasks to report.
>
> -- Joe
>
> On Mon, 25 Feb 2013, Tregaron Bayly wrote:
>
> > > Actually  don't bother.  I think I've found the problem.  It is related to
> > > pending_count and is easy to fix.
> > > Could you try this patch please?
> > >
> > > Thanks.
> > > NeilBrown
> > >
> > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > > index 6e5d5a5..fd86b37 100644
> > > --- a/drivers/md/raid1.c
> > > +++ b/drivers/md/raid1.c
> > > @@ -967,6 +967,7 @@ static void raid1_unplug(struct blk_plug_cb *cb, bool from_schedule)
> > >             bio_list_merge(&conf->pending_bio_list, &plug->pending);
> > >             conf->pending_count += plug->pending_cnt;
> > >             spin_unlock_irq(&conf->device_lock);
> > > +           wake_up(&conf->wait_barrier);
> > >             md_wakeup_thread(mddev->thread);
> > >             kfree(plug);
> > >             return;
> >
> > Running 15 hours now and no sign of the problem, which is 12 hours
> > longer than it took to trigger the bug in the past.  I'll continue
> > testing to be sure but I think this patch is a fix.
> >
> > Thanks for the fast response!
> >
> > Tregaron Bayly
> >