On 5/10/22 8:44 AM, Song Liu wrote:
On Mon, May 9, 2022 at 1:09 AM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote:
On 5/9/22 2:37 PM, Song Liu wrote:
On Fri, May 6, 2022 at 4:37 AM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote:
From: Guoqing Jiang <guoqing.jiang@xxxxxxxxxxxxxxx>
Unregistering the sync_thread doesn't need to hold reconfig_mutex since it
doesn't reconfigure the array.
And it could cause a deadlock problem for raid5 as follows:
1. Process A tried to reap the sync thread with reconfig_mutex held after
echoing "idle" to sync_action.
2. The raid5 sync thread was blocked because there were too many active stripes.
3. SB_CHANGE_PENDING was set (because write IO came from the upper layer),
which meant the number of active stripes couldn't be decreased.
4. SB_CHANGE_PENDING couldn't be cleared since md_check_recovery was not able
to hold reconfig_mutex.
More details in the link:
https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@xxxxxxxxxxxxx/T/#t
Let's unregister the sync thread between mddev_unlock and mddev_lock_nointr
(thanks for the report from kernel test robot <lkp@xxxxxxxxx>) if
reconfig_mutex is held, and mddev_is_locked() is introduced accordingly.
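
For illustration only, the pattern described above would look roughly like
the sketch below. The helper name is made up here and this is not the actual
patch, just the shape of "drop reconfig_mutex around the unregister, then
re-take it":

/*
 * Illustrative sketch, not the actual patch: reap the sync thread without
 * holding reconfig_mutex, so md_check_recovery() can still take the mutex
 * and clear SB_CHANGE_PENDING while we wait for the thread to exit.
 * Assumes the caller (e.g. the "echo idle > sync_action" path) currently
 * holds mddev->reconfig_mutex.
 */
static void reap_sync_thread_unlocked(struct mddev *mddev)
{
	mddev_unlock(mddev);                       /* drop reconfig_mutex         */
	md_unregister_thread(&mddev->sync_thread); /* may block in kthread_stop() */
	mddev_lock_nointr(mddev);                  /* re-take it before returning */
}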
mddev_is_locked() feels really hacky to me. It cannot tell whether mddev is
locked by the current thread. So technically, we could unlock reconfig_mutex
for another thread by accident, no?
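
To spell the concern out (my own guess at such a helper, not code from the
patch): a check built on mutex_is_locked() only reports that somebody holds
reconfig_mutex, not that the current thread is the owner, so the caller below
could drop a mutex that belongs to another thread.

/* Hypothetical helper of the kind under discussion; not taken from the patch. */
static inline bool mddev_is_locked(struct mddev *mddev)
{
	return mutex_is_locked(&mddev->reconfig_mutex);
}

static void example_caller(struct mddev *mddev)
{
	/*
	 * mutex_is_locked() is true even when another thread owns the mutex,
	 * so this unlocks reconfig_mutex on behalf of that thread by accident.
	 */
	if (mddev_is_locked(mddev))
		mddev_unlock(mddev);
}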
I can switch back to V2 if you think that is the correct way to go, though
no one has commented on the change in dm-raid.
I guess v2 is the best at the moment. I pushed a slightly modified v2 to
md-next.
I think this can be used to get a double-free from md_unregister_thread.
Please review
https://lore.kernel.org/linux-raid/8312a154-14fb-6f07-0cf1-8c970187cc49@xxxxxxxxxxxxx/
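
For context, the race has the usual check-then-free shape; below is a
simplified sketch of md_unregister_thread (the real function also takes
pers_lock and logs, details omitted), showing how two unlocked callers can
both free the same thread:

/* Simplified sketch of md_unregister_thread(); not the exact upstream code. */
void md_unregister_thread_sketch(struct md_thread **threadp)
{
	struct md_thread *thread = *threadp;

	if (!thread)
		return;            /* both racing callers can pass this check ... */
	*threadp = NULL;
	kthread_stop(thread->tsk);
	kfree(thread);             /* ... and then both free the same 'thread'    */
}

If two paths call this concurrently on &mddev->sync_thread without
reconfig_mutex (or an equivalent lock) serializing them, both can read the
same non-NULL pointer before either sets it to NULL.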
Best
Donald
Thanks for working on this.
Song
--
Donald Buczek
buczek@xxxxxxxxxxxxx
Tel: +49 30 8413 1433