Summary: The introduction of async reboot in commit 8064952c6504 ("driver
core: shut down devices asynchronously") still leads to frequent hangs on
shutdown, even with commit 4f2c346e6216 ("driver core: fix async device
shutdown hang") applied.

I did some further experimenting (and lots of reboots ...) and found out
that the bug is preemption related. For me it only occurs when using
CONFIG_PREEMPT=y or CONFIG_PREEMPT_RT=y. When using CONFIG_PREEMPT_NONE=y or
CONFIG_PREEMPT_VOLUNTARY=y everything works fine.

Test results (linux-next-20240925):

PREEMPT_NONE		20 reboots, no fail
PREEMPT_VOLUNTARY	20 reboots, no fail
PREEMPT			3 reboots, 4th reboot failed
PREEMPT_RT		2 reboots, 3rd reboot failed

The behaviour can be improved by increasing min_active for the async
workqueue:

diff --git a/kernel/async.c b/kernel/async.c
index 4c3e6a44595f..83e9267c61e7 100644
--- a/kernel/async.c
+++ b/kernel/async.c
@@ -358,5 +358,5 @@ void __init async_init(void)
 	 */
 	async_wq = alloc_workqueue("async", WQ_UNBOUND, 0);
 	BUG_ON(!async_wq);
-	workqueue_set_min_active(async_wq, WQ_DFL_ACTIVE);
+	workqueue_set_min_active(async_wq, WQ_UNBOUND_MAX_ACTIVE);
 }

With this it took 11 reboots to get a hang. I tried increasing
WQ_MAX_ACTIVE, too:

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 59c2695e12e7..314f554b45df 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -412,7 +412,7 @@ enum wq_flags {
 };
 
 enum wq_consts {
-	WQ_MAX_ACTIVE		= 512,	  /* I like 512, better ideas? */
+	WQ_MAX_ACTIVE		= 1024,	  /* 1024 for async shutdown with preempt{full,rt}*/
 	WQ_UNBOUND_MAX_ACTIVE	= WQ_MAX_ACTIVE,
 	WQ_DFL_ACTIVE		= WQ_MAX_ACTIVE / 2,

With this (and the first patch) I can get 20 clean reboots even when using
CONFIG_PREEMPT=y. I have not yet tested CONFIG_PREEMPT_RT=y with this.

Edit: Fixed In-Reply-To:

Bert Karwatzki