Hi,
On 2024/01/18 2:21, Mikulas Patocka wrote:
This commit fixes a deadlock in the LVM2 test
shell/lvconvert-raid-reshape-linear_to_raid6-single-type.sh
When MD_RECOVERY_WAIT is set or when md_is_rdwr(mddev) is false, the
function md_do_sync would not set MD_RECOVERY_DONE. Thus, stop_sync_thread
would wait for the flag MD_RECOVERY_DONE indefinitely.
Also, md_wakeup_thread_directly does nothing if the thread is waiting in
md_thread on thread->wqueue (it wakes the thread up, the thread would
check THREAD_WAKEUP and go to sleep again without doing anything). So,
this commit introduces a call to md_wakeup_thread from
md_wakeup_thread_directly.
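
To illustrate the wakeup behaviour described above, here is a minimal
userspace sketch. It is only a model: every name in it (model_thread,
MODEL_THREAD_WAKEUP, model_wake_directly, model_wakeup, model_thread_pass)
is hypothetical and is not the kernel's code. It assumes the thread's main
loop runs its work function only when a wakeup flag has been set, which is
how the commit message describes md_thread.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
enum { MODEL_THREAD_WAKEUP = 1u << 0 };

struct model_thread {
	unsigned int flags;	/* stands in for the thread's flag word */
	int runs;		/* number of times the work function ran */
};

/*
 * Models a bare wake_up_process(): the task gets scheduled, but the
 * wakeup flag is left untouched.
 */
static void model_wake_directly(struct model_thread *t)
{
	(void)t;
}

/* Models a wakeup that first sets the flag the loop checks. */
static void model_wakeup(struct model_thread *t)
{
	t->flags |= MODEL_THREAD_WAKEUP;
}

/*
 * One pass of the modelled main loop after the task has been woken.
 * Returns true if the work function ran, false if the thread would
 * simply go back to sleep.
 */
static bool model_thread_pass(struct model_thread *t)
{
	if (!(t->flags & MODEL_THREAD_WAKEUP))
		return false;	/* flag not set: sleep again, no work done */

	t->flags &= ~MODEL_THREAD_WAKEUP;
	t->runs++;		/* stands in for calling the work function */
	return true;
}
```

In this model, calling model_wake_directly() and then model_thread_pass()
does no work, while calling model_wakeup() first lets the pass run the
work function once.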
task:lvm state:D stack:0 pid:46322 tgid:46322 ppid:46079 flags:0x00004002
Call Trace:
<TASK>
__schedule+0x228/0x570
schedule+0x29/0xa0
schedule_timeout+0x6a/0xd0
? timer_shutdown_sync+0x10/0x10
stop_sync_thread+0x197/0x1c0 [md_mod]
? housekeeping_test_cpu+0x30/0x30
? table_deps+0x1b0/0x1b0 [dm_mod]
__md_stop_writes+0x10/0xd0 [md_mod]
md_stop_writes+0x18/0x30 [md_mod]
raid_postsuspend+0x32/0x40 [dm_raid]
dm_table_postsuspend_targets+0x34/0x50 [dm_mod]
dm_suspend+0xc4/0xd0 [dm_mod]
dev_suspend+0x186/0x2d0 [dm_mod]
? table_deps+0x1b0/0x1b0 [dm_mod]
ctl_ioctl+0x2e1/0x570 [dm_mod]
dm_ctl_ioctl+0x5/0x10 [dm_mod]
__x64_sys_ioctl+0x85/0xa0
do_syscall_64+0x5d/0x1a0
entry_SYSCALL_64_after_hwframe+0x46/0x4e
Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Cc: stable@xxxxxxxxxxxxxxx # v6.7
---
drivers/md/md.c | 8 +++++++-
drivers/md/raid5.c | 4 ++++
2 files changed, 11 insertions(+), 1 deletion(-)
Index: linux-2.6/drivers/md/md.c
===================================================================
--- linux-2.6.orig/drivers/md/md.c
+++ linux-2.6/drivers/md/md.c
@@ -8029,6 +8029,8 @@ static void md_wakeup_thread_directly(st
if (t)
wake_up_process(t->tsk);
rcu_read_unlock();
+
+ md_wakeup_thread(thread);
This is not correct. I have already explained (it is also in the comments)
what md_wakeup_thread_directly() is supposed to do.
}
void md_wakeup_thread(struct md_thread __rcu *thread)
@@ -8777,10 +8779,14 @@ void md_do_sync(struct md_thread *thread
/* just incase thread restarts... */
if (test_bit(MD_RECOVERY_DONE, &mddev->recovery) ||
- test_bit(MD_RECOVERY_WAIT, &mddev->recovery))
+ test_bit(MD_RECOVERY_WAIT, &mddev->recovery)) {
+ if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
+ set_bit(MD_RECOVERY_DONE, &mddev->recovery);
If you set MD_RECOVERY_DONE here, sync_thread will be unregistered. I
don't think this is the expected behaviour. Only dm-raid is using this
flag, and rs_start_reshape() already explains that it wants
sync_thread to work later until the table gets reloaded.
return;
+ }
if (!md_is_rdwr(mddev)) {/* never try to sync a read-only array */
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
+ set_bit(MD_RECOVERY_DONE, &mddev->recovery);
This change looks reasonable.
Thanks,
Kuai
return;
}