On 2017/10/17 下午8:17, Nate Dailey wrote: > If freeze_array is attempted in the middle of close_sync/ > wait_all_barriers, deadlock can occur. > > freeze_array will wait for nr_pending and nr_queued to line up. > wait_all_barriers increments nr_pending for each barrier bucket, one > at a time, but doesn't actually issue IO that could be counted in > nr_queued. So freeze_array is blocked until wait_all_barriers > completes and allow_all_barriers runs. At the same time, when > _wait_barrier sees array_frozen == 1, it stops and waits for > freeze_array to complete. > > Prevent the deadlock by making close_sync call _wait_barrier and > _allow_barrier for one bucket at a time, instead of deferring the > _allow_barrier calls until after all _wait_barriers are complete. > > Signed-off-by: Nate Dailey <nate.dailey@xxxxxxxxxxx> Reviewed-by: Coly Li <colyli@xxxxxxx> Thank you for the great effort on analyzing this deadlock, and provide a simpler and clean fix :-) Coly Li > --- > drivers/md/raid1.c | 24 ++++++------------------ > 1 file changed, 6 insertions(+), 18 deletions(-) > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c > index f3f3e40dc9d8..e4e8f9e565b7 100644 > --- a/drivers/md/raid1.c > +++ b/drivers/md/raid1.c > @@ -990,14 +990,6 @@ static void wait_barrier(struct r1conf *conf, sector_t sector_nr) > _wait_barrier(conf, idx); > } > > -static void wait_all_barriers(struct r1conf *conf) > -{ > - int idx; > - > - for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) > - _wait_barrier(conf, idx); > -} > - > static void _allow_barrier(struct r1conf *conf, int idx) > { > atomic_dec(&conf->nr_pending[idx]); > @@ -1011,14 +1003,6 @@ static void allow_barrier(struct r1conf *conf, sector_t sector_nr) > _allow_barrier(conf, idx); > } > > -static void allow_all_barriers(struct r1conf *conf) > -{ > - int idx; > - > - for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) > - _allow_barrier(conf, idx); > -} > - > /* conf->resync_lock should be held */ > static int get_unqueued_pending(struct r1conf *conf) > { > @@ -1654,8 +1638,12 @@ static void print_conf(struct r1conf *conf) > > static void close_sync(struct r1conf *conf) > { > - wait_all_barriers(conf); > - allow_all_barriers(conf); > + int idx; > + > + for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) { > + _wait_barrier(conf, idx); > + _allow_barrier(conf, idx); > + } > > mempool_destroy(conf->r1buf_pool); > conf->r1buf_pool = NULL; > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html