> On Mar 3, 2020, at 1:14 PM, Vitaly Mayatskikh <vmayatskikh@xxxxxxxxxxxxxxxx> wrote:
>
> When disk failure happens and the array has a spare drive, resync thread
> kicks in and starts to refill the spare. However it may get blocked by
> a retry thread that resubmits failed IO to a mirror and itself can get
> blocked on a barrier raised by the resync thread.
>
> Signed-off-by: Vitaly Mayatskikh <vmayatskikh@xxxxxxxxxxxxxxxx>
> ---
> drivers/md/raid10.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index ec136e4..f1a8e26 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -980,6 +980,7 @@ static void wait_barrier(struct r10conf *conf)
> {
> 	spin_lock_irq(&conf->resync_lock);
> 	if (conf->barrier) {
> +		struct bio_list *bio_list = current->bio_list;
> 		conf->nr_waiting++;
> 		/* Wait for the barrier to drop.
> 		 * However if there are already pending
> @@ -994,9 +995,16 @@ static void wait_barrier(struct r10conf *conf)
> 		wait_event_lock_irq(conf->wait_barrier,
> 				    !conf->barrier ||
> 				    (atomic_read(&conf->nr_pending) &&
> -				     current->bio_list &&
> -				     (!bio_list_empty(&current->bio_list[0]) ||
> -				      !bio_list_empty(&current->bio_list[1]))),
> +				     bio_list &&
> +				     (!bio_list_empty(&bio_list[0]) ||
> +				      !bio_list_empty(&bio_list[1]))) ||
> +				     /* move on if recovery thread is
> +				      * blocked by us
> +				      */
> +				     (conf->mddev->thread->tsk == current &&
> +				      test_bit(MD_RECOVERY_RUNNING,
> +					       &conf->mddev->recovery) &&
> +				      conf->nr_queued > 0),
> 				    conf->resync_lock);
> 		conf->nr_waiting--;
> 		if (!conf->nr_waiting)
> --
> 1.8.3.1
>

Song,

Have you had a chance to look at this patch? We would like to have it pulled in to the kernel.

-Nigel