Hello, On Fri, 20 May 2022 09:08:52 -0700 Jeff Johnson wrote: > >>>>> There are sleep in atomic context bugs when uploading device dump > >>>>> data on usb interface. The root cause is that the operations that > >>>>> may sleep are called in fw_dump_timer_fn which is a timer handler. > >>>>> The call tree shows the execution paths that could lead to bugs: > >>>>> > >>>>> (Interrupt context) > >>>>> fw_dump_timer_fn > >>>>> mwifiex_upload_device_dump > >>>>> dev_coredumpv(..., GFP_KERNEL) > >> > >> just looking at this description, why isn't the simple fix just to > >> change this call to use GFP_ATOMIC? > > > > Because change the parameter of dev_coredumpv() to GFP_ATOMIC could only solve > > partial problem. The following GFP_KERNEL parameters are in /lib/kobject.c > > which is not influenced by dev_coredumpv(). > > > > kobject_set_name_vargs > > kvasprintf_const(GFP_KERNEL, ...); //may sleep > > kstrdup(s, GFP_KERNEL); //may sleep > > Then it seems there is a problem with dev_coredumpm(). > > dev_coredumpm() takes a gfp param which means it expects to be called in > any context, but it then calls dev_set_name() which, as you point out, > cannot be called from an atomic context. > > So if we cannot change the fact that dev_set_name() cannot be called > from an atomic context, then it would seem to follow that > dev_coredumpv also cannot be called from an atomic > context and hence their gfp param is pointless and should presumably be > removed. Thanks for your time and suggestions! I think the gfp_t parameter of dev_coredumpv and dev_coredumpm may not be removed, because it could be used to pass value to gfp_t parameter of kzalloc in dev_coredumpm. What's more, there are also many other places use dev_coredumpv and dev_coredumpm, if we remove the gfp_t parameter, there are too many places that need to modify and these places are not in interrupt context. There are two solutions now: one is to moves the operations that may sleep into a work item. Another is to change the gfp_t parameter of dev_coredumpv from GFP_KERNEL to GFP_ATOMIC, and change the gfp_t parameter of kvasprintf_const and kstrdup from GFP_KERNEL to "in_interrupt() ? GFP_ATOMIC : GFP_KERNEL". The detail of the first solution is shown below: diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c index 88c72d1827a..cc3f1121eb9 100644 --- a/drivers/net/wireless/marvell/mwifiex/init.c +++ b/drivers/net/wireless/marvell/mwifiex/init.c @@ -63,11 +63,19 @@ static void wakeup_timer_fn(struct timer_list *t) adapter->if_ops.card_reset(adapter); } +static void fw_dump_work(struct work_struct *work) +{ + struct mwifiex_adapter *adapter = + container_of(work, struct mwifiex_adapter, devdump_work); + + mwifiex_upload_device_dump(adapter); +} + static void fw_dump_timer_fn(struct timer_list *t) { struct mwifiex_adapter *adapter = from_timer(adapter, t, devdump_timer); - mwifiex_upload_device_dump(adapter); + schedule_work(&adapter->devdump_work); } /* @@ -321,6 +329,7 @@ static void mwifiex_init_adapter(struct mwifiex_adapter *adapter) adapter->active_scan_triggered = false; timer_setup(&adapter->wakeup_timer, wakeup_timer_fn, 0); adapter->devdump_len = 0; + INIT_WORK(&adapter->devdump_work, fw_dump_work); timer_setup(&adapter->devdump_timer, fw_dump_timer_fn, 0); } @@ -401,6 +410,7 @@ mwifiex_adapter_cleanup(struct mwifiex_adapter *adapter) { del_timer(&adapter->wakeup_timer); del_timer_sync(&adapter->devdump_timer); + cancel_work_sync(&adapter->devdump_work); mwifiex_cancel_all_pending_cmd(adapter); wake_up_interruptible(&adapter->cmd_wait_q.wait); wake_up_interruptible(&adapter->hs_activate_wait_q); diff --git a/drivers/net/wireless/marvell/mwifiex/main.h b/drivers/net/wireless/marvell/mwifiex/main.h index 332dd1c8db3..c8ac2f57f18 100644 --- a/drivers/net/wireless/marvell/mwifiex/main.h +++ b/drivers/net/wireless/marvell/mwifiex/main.h @@ -900,6 +900,7 @@ struct mwifiex_adapter { struct work_struct rx_work; struct workqueue_struct *dfs_workqueue; struct work_struct dfs_work; + struct work_struct devdump_work; bool rx_work_enabled; bool rx_processing; bool delay_main_work; diff --git a/drivers/net/wireless/marvell/mwifiex/sta_event.c b/drivers/net/wireless/marvell/mwifiex/sta_event.c index 7d42c5d2dbf..8e28d0107d7 100644 --- a/drivers/net/wireless/marvell/mwifiex/sta_event.c +++ b/drivers/net/wireless/marvell/mwifiex/sta_event.c @@ -644,6 +644,7 @@ mwifiex_fw_dump_info_event(struct mwifiex_private *priv, upload_dump: del_timer_sync(&adapter->devdump_timer); + cancel_work_sync(&adapter->devdump_work); mwifiex_upload_device_dump(adapter); } The detail of the second solution is shown below: diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c index ace7371c477..258906920a2 100644 --- a/drivers/net/wireless/marvell/mwifiex/main.c +++ b/drivers/net/wireless/marvell/mwifiex/main.c @@ -1116,7 +1116,7 @@ void mwifiex_upload_device_dump(struct mwifiex_adapter *adapter) mwifiex_dbg(adapter, MSG, "== mwifiex dump information to /sys/class/devcoredump start\n"); dev_coredumpv(adapter->dev, adapter->devdump_data, adapter->devdump_len, - GFP_KERNEL); + GFP_ATOMIC); mwifiex_dbg(adapter, MSG, "== mwifiex dump information to /sys/class/devcoredump end\n"); diff --git a/lib/kobject.c b/lib/kobject.c index 5f0e71ab292..dbb2e57ef78 100644 --- a/lib/kobject.c +++ b/lib/kobject.c @@ -254,7 +254,7 @@ int kobject_set_name_vargs(struct kobject *kobj, const char *fmt, if (kobj->name && !fmt) return 0; - s = kvasprintf_const(GFP_KERNEL, fmt, vargs); + s = kvasprintf_const(in_interrupt() ? GFP_ATOMIC : GFP_KERNEL, fmt, vargs); if (!s) return -ENOMEM; @@ -267,7 +267,7 @@ int kobject_set_name_vargs(struct kobject *kobj, const char *fmt, if (strchr(s, '/')) { char *t; - t = kstrdup(s, GFP_KERNEL); + t = kstrdup(s, in_interrupt() ? GFP_ATOMIC : GFP_KERNEL); kfree_const(s); if (!t) return -ENOMEM; Which one do you think is better? I will choose the better one to test. Best regards, Duoming Zhou