On Tue, 23 May 2023 21:39:00 +0800
Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> The deadlock is described in [1], as the last patch described, it's
> fixed first by [2], however this fix will be reverted and the deadlock
> is supposed to be fixed by [3].
>
> [1]
> https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@xxxxxxxxxxxxx/T/#t
> [2]
> https://lore.kernel.org/linux-raid/20220621031129.24778-1-guoqing.jiang@xxxxxxxxx/
> [3]
> https://lore.kernel.org/linux-raid/20230322064122.2384589-5-yukuai1@xxxxxxxxxxxxxxx/
>
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
> ---
>  tests/24raid456deadlock | 56 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 56 insertions(+)
>  create mode 100644 tests/24raid456deadlock
>
> diff --git a/tests/24raid456deadlock b/tests/24raid456deadlock
> new file mode 100644
> index 00000000..161c3ab8
> --- /dev/null
> +++ b/tests/24raid456deadlock
> @@ -0,0 +1,56 @@
> +devs="$dev0 $dev1 $dev2 $dev3 $dev4 $dev5"
> +runtime=120
> +pid=""
> +old=`cat /proc/sys/vm/dirty_background_ratio`
> +
> +test_write_action()
> +{
> +	while true; do
> +		echo check > /sys/block/md0/md/sync_action &> /dev/null
> +		sleep 0.1
> +		echo idle > /sys/block/md0/md/sync_action &> /dev/null
> +	done
> +}
> +
> +test_write_back()
> +{
> +	fio -filename=$md0 -bs=4k -rw=write -numjobs=1 -name=test \
> +		-time_based -runtime=$runtime &> /dev/null
> +}
> +
> +set_up_test()
> +{
> +	fio -h &> /dev/null || die "fio not found"
> +
> +	# create a simple raid6
> +	mdadm -Cv -R -n 6 -l6 $md0 $devs --assume-clean || die "create raid6 failed"
> +
> +	# trigger dirty pages write back
> +	echo 0 > /proc/sys/vm/dirty_background_ratio
> +}
> +
> +clean_up_test()
> +{
> +	echo $old > /proc/sys/vm/dirty_background_ratio
> +
> +	kill -9 $pid
> +	sync $md0
> +
> +	if ! mdadm -S $md0; then
> +		die "can't stop array, deadlock is probably triggered"
> +	fi

Stop is problematic; I described why in the previous patch.
You can clean up the array manually instead (I think you should, to limit
complexity):

echo inactive > /sys/block/mdX/md/array_state
echo clear > /sys/block/mdX/md/array_state

Probably one of those actions will hang, right? The question is how we can
catch it.

I'm fine with the current approach too:
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@xxxxxxxxxxxxxxx>

Thanks,
Mariusz
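
On the "how can we catch it" question above, here is a minimal sketch of one
option, not something taken from the patch or from the mdadm test suite. It
runs the sysfs write in the background and polls for completion, so a writer
that never returns is reported as a hang instead of blocking the test. The
helper name write_or_detect_hang, the wh_* variables and the default limit are
all hypothetical. The helper does not try to kill the writer, on the
assumption that a task stuck inside the kernel would not respond to signals
anyway.

```shell
# Hypothetical helper, not part of the mdadm test suite.
# Usage: write_or_detect_hang VALUE PATH [LIMIT_SECONDS]
# Returns 0 if the write finished in time, 1 if it is still blocked
# after LIMIT_SECONDS, 2 if the write finished but failed.
write_or_detect_hang() {
	wh_value=$1
	wh_path=$2
	wh_limit=${3:-10}
	wh_flag=$(mktemp) || return 2
	wh_waited=0

	# Do the write in the background so a stuck writer cannot block us.
	# The flag file is filled in only once the write has returned.
	(
		if echo "$wh_value" > "$wh_path"; then
			echo ok > "$wh_flag"
		else
			echo failed > "$wh_flag"
		fi
	) 2> /dev/null &

	# Poll once per second until the flag is filled or the limit expires.
	while [ ! -s "$wh_flag" ]; do
		if [ "$wh_waited" -ge "$wh_limit" ]; then
			rm -f "$wh_flag"
			return 1
		fi
		sleep 1
		wh_waited=$((wh_waited + 1))
	done

	wh_result=$(cat "$wh_flag")
	rm -f "$wh_flag"
	[ "$wh_result" = ok ] || return 2
}
```

In clean_up_test() this could be used roughly as follows, with the 30 second
limit being an arbitrary assumption:

write_or_detect_hang inactive /sys/block/md0/md/array_state 30 ||
	die "can't deactivate array, deadlock is probably triggered"

Whether the "inactive" or the "clear" write is the one that blocks is still
the open question; the helper only reports that a write did not return in
time.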