Re: Stuck in md_write_start because MD_SB_CHANGE_PENDING can't be cleared

Xiao Ni <xni@xxxxxxxxxx> · Wed, 6 Sep 2017 21:37:57 -0400 (EDT)

----- Original Message -----
> From: "Xiao Ni" <xni@xxxxxxxxxx>
> To: "NeilBrown" <neilb@xxxxxxxx>, "linux-raid" <linux-raid@xxxxxxxxxxxxxxx>
> Cc: shli@xxxxxxxxxx
> Sent: Tuesday, September 5, 2017 10:15:00 AM
> Subject: Re: Stuck in md_write_start because MD_SB_CHANGE_PENDING can't be cleared
> 
> 
> 
> On 09/05/2017 09:36 AM, NeilBrown wrote:
> > On Mon, Sep 04 2017, Xiao Ni wrote:
> >
> >>
> >> In function handle_stripe:
> >>         if (s.handle_bad_blocks ||
> >>             test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags)) {
> >>                 set_bit(STRIPE_HANDLE, &sh->state);
> >>                 goto finish;
> >>         }
> >>
> >> Because MD_SB_CHANGE_PENDING is set, the stripes can't be handled.
> >>
> > Right, of course.  I see what is happening now.
> >
> > - raid5d cannot complete stripes until the metadata is written
> > - the metadata cannot be written until raid5d gets the mddev_lock
> > - mddev_lock is held by the write to suspend_hi
> > - the write to suspend_hi is waiting for raid5_quiesce
> > - raid5_quiesce is waiting for some stripes to complete.
> >
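The five dependencies listed above form a closed loop, which can be checked mechanically. The following is only a minimal sketch in plain userspace C, not kernel code: the enum labels, the md_waits_on table and waits_on_itself() are hypothetical names made up for illustration, and each party is assumed to wait on exactly one other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the five parties in the chain; these are
 * illustration-only names, not kernel identifiers. */
enum node {
    STRIPES,          /* stripes that raid5d needs to complete */
    METADATA_WRITE,   /* the metadata (superblock) write */
    MDDEV_LOCK,       /* the mddev lock */
    SUSPEND_HI_WRITE, /* the sysfs write to suspend_hi */
    RAID5_QUIESCE,    /* raid5_quiesce */
    NODE_COUNT        /* also used as "waits on nothing" */
};

/* md_waits_on[a] == b means "a cannot make progress until b does".
 * One entry per bullet in the list above. */
const enum node md_waits_on[NODE_COUNT] = {
    [STRIPES]          = METADATA_WRITE,
    [METADATA_WRITE]   = MDDEV_LOCK,
    [MDDEV_LOCK]       = SUSPEND_HI_WRITE,
    [SUSPEND_HI_WRITE] = RAID5_QUIESCE,
    [RAID5_QUIESCE]    = STRIPES,
};

/* Follow the wait chain from 'start'. Returns true if the chain leads
 * back to 'start', i.e. 'start' is transitively waiting on itself. */
bool waits_on_itself(const enum node *deps, enum node start)
{
    assert(deps != NULL);
    assert(start < NODE_COUNT);

    enum node cur = start;
    for (size_t steps = 0; steps < NODE_COUNT; steps++) {
        cur = deps[cur];
        if (cur == NODE_COUNT)  /* chain ends: nothing further to wait on */
            return false;
        if (cur == start)       /* came back around: circular wait */
            return true;
    }
    return false;               /* ran into a loop that excludes 'start' */
}
```

With the table as written, waits_on_itself(md_waits_on, STRIPES) returns true for every starting node. Replacing any single entry with NODE_COUNT (that is, removing one of the waits) makes it return false, which is the general shape of the fixes discussed below.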
> > We could declare that ->quiesce(, 1) cannot be called while holding the
> > lock.
> > We could possibly allow it but only if md_update_sb() is called first,
> > though that might still be racy.
> >
> > ->quiesce(, 1) is currently called from:
> >   mddev_suspend
> >   suspend_lo_store
> >   suspend_hi_store
> >   __md_stop_writes
> >   mddev_detach
> >   set_bitmap_file
> >   update_array_info (when setting/removing internal bitmap)
> >   md_do_sync
> >
> > and most of those are called with the lock held, or take the lock.
> >
> > Maybe we should *require* that mddev_lock is held when calling
> > ->quiesce() and have ->quiesce() do the metadata update.
> >
> > Something like the following maybe.  Can you test it?
> 
> Hi Neil
> 
> Thanks for the analysis. I need to think for a while :)
> I have already applied the patch and the test is running now. It usually
> takes more than 5 hours to reproduce this problem. I'll let it run for more
> than 24 hours and report the test result later.

Hi Neil

The problem still exists, but it doesn't show a call trace this time. It
got stuck yesterday; I didn't notice because there was no call trace.

I enabled dynamic debug output for raid5.c:

echo file raid5.c +p > /sys/kernel/debug/dynamic_debug/control

The debug output shows that raid5d is still spinning.
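One way to picture the spinning is a toy model of the early exit in handle_stripe quoted earlier in this thread. This is a sketch only, in userspace C: the toy_* types and functions are hypothetical stand-ins, and it assumes a stripe that is flagged again is simply picked up on the next pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for kernel state. The field comments say what each one
 * models; the types and function names are hypothetical. */
struct toy_stripe {
    bool handle_again;  /* models STRIPE_HANDLE being set again */
    bool done;          /* stripe has completed */
};

struct toy_array {
    bool sb_change_pending;  /* models MD_SB_CHANGE_PENDING */
};

/* Models the early exit: while the flag is pending, the stripe is only
 * marked for another pass and no progress is made. */
void toy_handle_stripe(const struct toy_array *md, struct toy_stripe *sh)
{
    assert(md && sh);
    sh->handle_again = false;
    if (md->sb_change_pending) {
        sh->handle_again = true;  /* come back later, nothing completed */
        return;
    }
    sh->done = true;
}

/* A raid5d-like loop, capped at max_passes so the model terminates.
 * Returns the number of passes actually used. */
int toy_raid5d(const struct toy_array *md, struct toy_stripe *sh,
               int max_passes)
{
    assert(max_passes >= 0);
    int passes = 0;
    while (passes < max_passes && !sh->done) {
        toy_handle_stripe(md, sh);
        passes++;
    }
    return passes;
}
```

With sb_change_pending set, toy_raid5d() uses every pass it is given and the stripe never completes; once the flag is cleared, a single pass finishes it.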

Regards
Xiao

> 
> Regards
> Xiao
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 