Re: Panic at BUG_ON(force && !conf->barrier);

NeilBrown <neilb@xxxxxxxx> · Thu, 30 Aug 2018 14:36:31 +1000

On Wed, Aug 29 2018, Xiao Ni wrote:

> ----- Original Message -----
>> From: "NeilBrown" <neilb@xxxxxxxx>
>> To: "Xiao Ni" <xni@xxxxxxxxxx>, "Shaohua Li" <shli@xxxxxxxxxx>
>> Cc: "linux-raid" <linux-raid@xxxxxxxxxxxxxxx>, "Nigel Croxon" <ncroxon@xxxxxxxxxx>
>> Sent: Thursday, August 30, 2018 8:30:35 AM
>> Subject: Re: Panic at BUG_ON(force && !conf->barrier);
>> 
>> On Thu, Aug 16 2018, Xiao Ni wrote:
>> 
>> > Hi Shaohua
>> >
>> > I encounter one panic recent in rhel7.6. The test is:
>> > 1. Create some VDO devices
>> > 2. Create raid10 device on 4 vdo devices
>> > 3. Reshape raid10 device to 6 vdo devices.
>> >
>> > When sector_nr <= last it needs to goto read_more. If the r10_bio
>> > containing the
>> > read_bio which is submitted before goto has freed and lower_barrier has
>> > called.
>> > It'll panic at BUG_ON(force && !conf->barrier)
>> >
>> > The possibility of this is decreased by c85ba1 (md: raid1/raid10: don't
>> > handle failure of bio_add_page())
>> > In the test case bio_add_page fails after adding one page. It usually calls
>> > goto read_more. So the
>> > problem happens easily.
>> >
>> > But in upstream it still has the possibility to hit the BUG_ON. Because the
>> > max_sectors return from
>> > read_balance can let sector_nr <= last.
>> >
>> > Do you think it's the right way to fix this?
>> 
>> No.
>> The right way to fix it is:
>> 
>> 1/ before the read_more: label, put
>>      raise_barrier(conf, 0);
>> 2/ in place of the existing raise_barrier() put
>>      raise_barrier(conf, 1);
>> 3/ after the "goto read_more" put
>>      lower_barrier(conf);
>
> Hi Neil
>
> Thanks for your reply. 
>
> The argument force of raise_barrier tries to submit those
> reshape bios as close as it can. It's the target. So the
> speed of reshape can be more quick, right?

Wrong.  The goal is to avoid deadlock.  If you hold a lock (which
raise_barrier effectively is), it isn't safe to block while waiting to
take it again - even if it is a reader-lock.

>
> Add raise_barrier(conf, 0) before read_more label, and call
> raise_barrier(conf, 1). The second call don't wait conf->nr_waiting
> to be zero. But the first call of raise_barrier(conf, 0), it still
> needs to wait for conf->nr_waiting to be zero. The time should be
> same if we call raise_barrier(conf, 0) directly in place of the
> existing raise_barrier().
>
> And is there a possibility it calls goto read_more many times? Fox example:
> there are two near copies. And they all have bad sectors in different
> places. If so it calls raise_barrier(conf, 0) many times and it just calls
> lower_barrier one time.

Yes, it could goto read_more many times.  Every time it calls
raise_barrier() and also schedules a request.  When the request
completes, lower_barrier is called.  So the raise_barrier() calls after
read_more exactly match the lower_barrier calls when the request
completes.
The new raise_barrier call I introduced balances the new
lower_barrier() call I introduced.  The purpose of these two is to keep
->barrier elevated until reshape_request can completed a set of
requests.

NeilBrown

>
> How about this way:
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 35bd3a6..49ca735 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -4531,11 +4531,12 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
>                 allow_barrier(conf);
>         }
>  
> +       raise_barrier(conf, 0);
> +
>  read_more:
>         /* Now schedule reads for blocks from sector_nr to last */
>         r10_bio = raid10_alloc_init_r10buf(conf);
>         r10_bio->state = 0;
> -       raise_barrier(conf, sectors_done != 0);
>         atomic_set(&r10_bio->remaining, 0);
>         r10_bio->mddev = mddev;
>         r10_bio->sector = sector_nr;
> @@ -4625,11 +4626,14 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
>         md_sync_acct_bio(read_bio, r10_bio->sectors);
>         atomic_inc(&r10_bio->remaining);
>         read_bio->bi_next = NULL;
> -       generic_make_request(read_bio);
>         sector_nr += nr_sectors;
>         sectors_done += nr_sectors;
> -       if (sector_nr <= last)
> +       if (sector_nr <= last) {
> +               raise_barrier(conf, 1);
> +               generic_make_request(read_bio);
>                 goto read_more;
> +       } else
> +               generic_make_request(read_bio);
>  
>         /* Now that we have done the whole section we can
>          * update reshape_progress
>
> Best Regards
> Xiao
Attachment:
signature.asc

Description: PGP signature