Panic at BUG_ON(force && !conf->barrier);

Xiao Ni <xni@xxxxxxxxxx> · Thu, 16 Aug 2018 21:03:10 -0400 (EDT)

Hi Shaohua

I recently hit a panic on RHEL 7.6. The test is:
1. Create some VDO devices
2. Create raid10 device on 4 vdo devices
3. Reshape raid10 device to 6 vdo devices. 

When sector_nr <= last, reshape_request() has to goto read_more. If the r10_bio
containing the read_bio submitted before the goto has already been freed and
lower_barrier() has already been called, the next raise_barrier() call panics at
BUG_ON(force && !conf->barrier).
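
The sequence above can be sketched as a small user-space model. This is only an
illustration, not the kernel code: model_conf, model_raise_barrier(),
model_lower_barrier() and model_two_pass_reshape() are hypothetical names, the
sector count is arbitrary, and the model ignores all of the waiting and locking
that the real raise_barrier() does. It only tracks the barrier counter and
reports where the BUG_ON condition would be true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the barrier counter kept in the raid10 conf. */
struct model_conf {
	int barrier;
};

/*
 * Model of raise_barrier(conf, force).
 * Returns false where the kernel would hit BUG_ON(force && !conf->barrier).
 */
static bool model_raise_barrier(struct model_conf *conf, bool force)
{
	if (force && conf->barrier == 0)
		return false;
	conf->barrier++;
	return true;
}

/* Model of lower_barrier(conf): drop one barrier reference. */
static void model_lower_barrier(struct model_conf *conf)
{
	assert(conf->barrier > 0);
	conf->barrier--;
}

/*
 * Model of one reshape_request() call that needs two passes
 * (sector_nr <= last after the first pass, so it takes goto read_more).
 * If the first r10_bio completes before the second pass starts, its
 * barrier is already lowered when the forced raise happens.
 */
static bool model_two_pass_reshape(bool first_bio_completes_early)
{
	struct model_conf conf = { .barrier = 0 };
	int sectors_done = 0;

	/* First pass: sectors_done == 0, so force is false. */
	if (!model_raise_barrier(&conf, sectors_done != 0))
		return false;
	sectors_done += 8;	/* arbitrary amount, assumption for the model */

	/* The submitted read finishes and its r10_bio is freed. */
	if (first_bio_completes_early)
		model_lower_barrier(&conf);

	/* Second pass via read_more: sectors_done != 0, so force is true. */
	return model_raise_barrier(&conf, sectors_done != 0);
}
```

In this model, model_two_pass_reshape(false) succeeds because the first barrier
is still held when the forced raise happens, while model_two_pass_reshape(true)
returns false, which corresponds to the BUG_ON firing.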

The chance of hitting this was reduced by c85ba1 (md: raid1/raid10: don't handle failure of bio_add_page()).
In the test case bio_add_page() fails after adding one page, so it usually takes the goto read_more
path and the problem is easy to reproduce.

But upstream can still hit the BUG_ON, because the max_sectors returned from
read_balance() can leave sector_nr <= last.

Do you think this is the right way to fix it?

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 35bd3a6..f6de031 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4535,7 +4535,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
        /* Now schedule reads for blocks from sector_nr to last */
        r10_bio = raid10_alloc_init_r10buf(conf);
        r10_bio->state = 0;
-       raise_barrier(conf, sectors_done != 0);
+       raise_barrier(conf, 0);
        atomic_set(&r10_bio->remaining, 0);
        r10_bio->mddev = mddev;
        r10_bio->sector = sector_nr;

Best Regards
Xiao