On Mon, May 9, 2022 at 1:09 AM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote: > > > > On 5/9/22 2:37 PM, Song Liu wrote: > > On Fri, May 6, 2022 at 4:37 AM Guoqing Jiang<guoqing.jiang@xxxxxxxxx> wrote: > >> From: Guoqing Jiang<guoqing.jiang@xxxxxxxxxxxxxxx> > >> > >> Unregister sync_thread doesn't need to hold reconfig_mutex since it > >> doesn't reconfigure array. > >> > >> And it could cause deadlock problem for raid5 as follows: > >> > >> 1. process A tried to reap sync thread with reconfig_mutex held after echo > >> idle to sync_action. > >> 2. raid5 sync thread was blocked if there were too many active stripes. > >> 3. SB_CHANGE_PENDING was set (because of write IO comes from upper layer) > >> which causes the number of active stripes can't be decreased. > >> 4. SB_CHANGE_PENDING can't be cleared since md_check_recovery was not able > >> to hold reconfig_mutex. > >> > >> More details in the link: > >> https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@xxxxxxxxxxxxx/T/#t > >> > >> Let's call unregister thread between mddev_unlock and mddev_lock_nointr > >> (thanks for the report from kernel test robot<lkp@xxxxxxxxx>) if the > >> reconfig_mutex is held, and mddev_is_locked is introduced accordingly. > > mddev_is_locked() feels really hacky to me. It cannot tell whether > > mddev is locked > > by current thread. So technically, we can unlock reconfigure_mutex for > > other thread > > by accident, no? > > I can switch back to V2 if you think that is the correct way to do though no > one comment about the change in dm-raid. I guess v2 is the best at the moment. I pushed a slightly modified v2 to md-next. Thanks for working on this. Song