On Tue, 12 Sep 2023 21:25:24 +0800
Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

> Hi,
>
> On 2023/09/12 21:01, Mariusz Tkaczyk wrote:
> > While working on the changes proposed by Kuai [1], I determined that
> > mddev->active is continuously decremented for an array marked with
> > MD_CLOSING. That led me to md_seq_next(), which was changed by [2].
> > I found the regression there: if mddev_get() fails, we update the
> > mddev pointer and, as a result, we _put the failed device.
>
> This mddev is decremented while another mddev is incremented; that's
> why AceLan said that a single array can't reproduce the problem.
>
> And because mddev->active is leaked, del_gendisk() will never be
> called for the mddev while closing the array. That's why the user will
> always see this array, which causes an infinite open -> stop array ->
> close loop in systemd-shutdown.

Ohh, I see the scenario now...

The first array can be stopped successfully. We mark it MD_DELETED and
proceed with scheduling the wq, but in the middle of that md_seq_next()
increases active for the other array and decrements active for the one
with MD_DELETED. For this next array we can no longer reach active == 0.

Song, let me know if you need a description like that in the commit
message.

Thanks!
Mariusz
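
For reference, the imbalance described above can be reproduced with a small user-space model. This is only a sketch and not the kernel code: struct fake_mddev, fake_get(), fake_put(), fake_seq_next() and run_scenario() are hypothetical stand-ins, and the only behaviour assumed is what the thread states, namely that a get on an MD_DELETED array fails and that the iterator then puts the array it last visited instead of the one it actually holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mddev: only what the model needs. */
struct fake_mddev {
    int active;     /* reference count, plays the role of mddev->active */
    bool deleted;   /* plays the role of the MD_DELETED flag */
};

/* Modelled on mddev_get(): refuse a reference on a deleted array. */
static bool fake_get(struct fake_mddev *m)
{
    if (m->deleted)
        return false;
    m->active++;
    return true;
}

static void fake_put(struct fake_mddev *m)
{
    m->active--;
}

/*
 * Step from devs[cur] to the next array that accepts a reference and
 * return its index (n if the list is exhausted). The caller holds a
 * reference on devs[cur], which must be dropped here.
 * buggy == true drops it on the last array visited, as in the
 * regression; buggy == false drops it on the array actually held.
 */
static size_t fake_seq_next(struct fake_mddev *devs, size_t n,
                            size_t cur, bool buggy)
{
    size_t to_put = cur;   /* array whose reference we hold */
    size_t last = cur;     /* moves forward past every failed get */
    size_t next = cur + 1;

    while (next < n && !fake_get(&devs[next])) {
        last = next;
        next++;
    }
    fake_put(&devs[buggy ? last : to_put]);
    return next;
}

/* Three arrays: md0 held by the iterator, md1 deleted, md2 live. */
void run_scenario(bool buggy, int active_out[3])
{
    struct fake_mddev devs[3] = {
        { .active = 1, .deleted = false },
        { .active = 0, .deleted = true  },
        { .active = 0, .deleted = false },
    };
    size_t next = fake_seq_next(devs, 3, 0, buggy);

    assert(next == 2);     /* md1 is skipped in both variants */
    for (size_t i = 0; i < 3; i++)
        active_out[i] = devs[i].active;
}
```

Calling run_scenario(true, out) leaves md0 with a reference that nobody will drop and pushes the deleted md1 below zero, while run_scenario(false, out) returns md0 to zero and leaves md1 untouched. Two arrays plus a deleted one are needed to see the effect, which matches the remark that a single array can't reproduce the problem.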