On Tue, 30 Jan 2024 20:55:39 -0800
Song Liu <song@xxxxxxxxxx> wrote:

> On Tue, Jan 30, 2024 at 6:41 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx>
> >
> > Can you test the following patch?
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index e3a56a958b47..a8db84c200fe 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -578,8 +578,12 @@ static void submit_flushes(struct work_struct *ws)
> >  		rcu_read_lock();
> >  	}
> >  	rcu_read_unlock();
> > -	if (atomic_dec_and_test(&mddev->flush_pending))
> > +	if (atomic_dec_and_test(&mddev->flush_pending)) {
> > +		/* The pair is percpu_ref_get() from md_flush_request() */
> > +		percpu_ref_put(&mddev->active_io);
> > +
> >  		queue_work(md_wq, &mddev->flush_work);
> > +	}
> >  }
> >
> >  static void md_submit_flush_data(struct work_struct *ws)
>
> This fixes the issue in my tests. Please submit the official patch.
> Also, we should add a test in mdadm/tests to cover this case.
>
> Thanks,
> Song
>

Hi Kuai,

On my hardware, the issue also stopped reproducing with this fix.
I applied the fix on top of the current HEAD of the master branch of the
kernel/git/torvalds/linux.git repo.

Thanks,
Blazej