On Thu, Apr 23, 2020 at 5:38 AM Nigel Croxon <ncroxon@xxxxxxxxxx> wrote: > > > On 1/28/20 2:08 PM, Song Liu wrote: > > On Mon, Jan 27, 2020 at 11:56 AM David Jeffery <djeffery@xxxxxxxxxx> wrote: > >> On Mon, Jan 27, 2020 at 12:29 PM Song Liu <song@xxxxxxxxxx> wrote: > >>> On Mon, Jan 27, 2020 at 7:26 AM David Jeffery <djeffery@xxxxxxxxxx> wrote: > >>>> When using RAID1 and write-behind, md can deadlock when errors occur. With > >>>> write-behind, r1bio structs can be accounted by raid1 as queued but not > >>>> counted as pending. The pending count is dropped when the original bio is > >>>> returned complete but write-behind for the r1bio may still be active. > >>>> > >>>> This breaks the accounting used in some conditions to know when the raid1 > >>>> md device has reached an idle state. It can result in calls to > >>>> freeze_array deadlocking. freeze_array will never complete from a negative > >>>> "unqueued" value being calculated due to a queued count larger than the > >>>> pending count. > >>>> > >>>> To properly account for write-behind, move the call to allow_barrier from > >>>> call_bio_endio to raid_end_bio_io. When using write-behind, md can call > >>>> call_bio_endio before all write-behind I/O is complete. Using > >>>> raid_end_bio_io for the point to call allow_barrier will release the > >>>> pending count at a point where all I/O for an r1bio, even write-behind, is > >>>> done. > >>>> > >>>> Signed-off-by: David Jeffery <djeffery@xxxxxxxxxx> > >>>> --- > >>>> > >>>> raid1.c | 13 +++++++------ > >>>> 1 file changed, 7 insertions(+), 6 deletions(-) > >>>> > >>>> > >>>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c > >>>> index 201fd8aec59a..0196a9d9f7e9 100644 > >>>> --- a/drivers/md/raid1.c > >>>> +++ b/drivers/md/raid1.c > >>>> @@ -279,22 +279,17 @@ static void reschedule_retry(struct r1bio *r1_bio) > >>>> static void call_bio_endio(struct r1bio *r1_bio) > >>>> { > >>>> struct bio *bio = r1_bio->master_bio; > >>>> - struct r1conf *conf = r1_bio->mddev->private; > >>>> > >>>> if (!test_bit(R1BIO_Uptodate, &r1_bio->state)) > >>>> bio->bi_status = BLK_STS_IOERR; > >>>> > >>>> bio_endio(bio); > >>>> - /* > >>>> - * Wake up any possible resync thread that waits for the device > >>>> - * to go idle. > >>>> - */ > >>>> - allow_barrier(conf, r1_bio->sector); > >>> raid1_end_write_request() also calls call_bio_endio(). Do we need to fix > >>> that? > >> This basically is the problem the patch is fixing. We don't want > >> allow_barrier() being called when call_bio_endio() is called directly > >> from raid1_end_write_request(). Write-behind can still be active here > >> so it was dropping the pending accounting too early. We only want it > >> called when all I/O for the r1bio is complete, which shifting the > >> allow_barrier() call to raid_end_bio_io() does. > > Thanks for the explanation. This looks good to me. I will process it > > after the merge window. > > > > I will also re-evaluate whether we need it for stable. > > > > Thanks again, > > Song > > > Hello Song, > > Did you pull in this patch after the merge window? > > I don't see it in your tree. I am really sorry I missed this one. Just applied it to md-next. Thanks, Song