duoming@xxxxxxxxxx writes:

> Hello maintainers,
>
> Thank you for your time and suggestions!
>
>> > There are sleep in atomic context bugs when uploading device dump
>> > data in mwifiex. The root cause is that dev_coredumpv could not
>> > be used in atomic contexts, because it calls dev_set_name which
>> > include operations that may sleep. The call tree shows execution
>> > paths that could lead to bugs:
>> >
>> >    (Interrupt context)
>> > fw_dump_timer_fn
>> >   mwifiex_upload_device_dump
>> >     dev_coredumpv(..., GFP_KERNEL)
>> >       dev_coredumpm()
>> >         kzalloc(sizeof(*devcd), gfp); //may sleep
>> >         dev_set_name
>> >           kobject_set_name_vargs
>> >             kvasprintf_const(GFP_KERNEL, ...); //may sleep
>> >             kstrdup(s, GFP_KERNEL); //may sleep
>> >
>> > In order to let dev_coredumpv support atomic contexts, this patch
>> > changes the gfp_t parameter of kvasprintf_const and kstrdup in
>> > kobject_set_name_vargs from GFP_KERNEL to GFP_ATOMIC. What's more,
>> > In order to mitigate bug, this patch changes the gfp_t parameter
>> > of dev_coredumpv from GFP_KERNEL to GFP_ATOMIC.
>>
>> vmalloc in atomic context?
>>
>> Not only does dev_coredumpm set a device name dev_coredumpm creates an
>> entire device to hold the device dump.
>>
>> My sense is that either dev_coredumpm needs to be rebuilt on a
>> completely different principle that does not need a device to hold the
>> coredump (so that it can be called from interrupt context) or that
>> dev_coredumpm should never be called in an context that can not sleep.
>
> The following solution removes the gfp_t parameter of dev_coredumpv(),
> dev_coredumpm() and dev_coredumpsg() and change the gfp_t parameter of
> kzalloc() in dev_coredumpm() to GFP_KERNEL, in order to show that these
> functions can not be used in atomic context.
>
> What's more, I move the operations that may sleep into a work item and use
> schedule_work() to call a kernel thread to do the operations that may sleep.
>

[...]

> --- a/drivers/net/wireless/marvell/mwifiex/init.c
> +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> @@ -63,11 +63,19 @@ static void wakeup_timer_fn(struct timer_list *t)
>  	adapter->if_ops.card_reset(adapter);
>  }
>
> +static void fw_dump_work(struct work_struct *work)
> +{
> +	struct mwifiex_adapter *adapter =
> +		container_of(work, struct mwifiex_adapter, devdump_work);
> +
> +	mwifiex_upload_device_dump(adapter);
> +}
> +
>  static void fw_dump_timer_fn(struct timer_list *t)
>  {
>  	struct mwifiex_adapter *adapter = from_timer(adapter, t, devdump_timer);
>
> -	mwifiex_upload_device_dump(adapter);
> +	schedule_work(&adapter->devdump_work);
>  }
>
>  /*
> @@ -321,6 +329,7 @@ static void mwifiex_init_adapter(struct mwifiex_adapter *adapter)
>  	adapter->active_scan_triggered = false;
>  	timer_setup(&adapter->wakeup_timer, wakeup_timer_fn, 0);
>  	adapter->devdump_len = 0;
> +	INIT_WORK(&adapter->devdump_work, fw_dump_work);
>  	timer_setup(&adapter->devdump_timer, fw_dump_timer_fn, 0);
>  }
>
> @@ -401,6 +410,7 @@ mwifiex_adapter_cleanup(struct mwifiex_adapter *adapter)
>  {
>  	del_timer(&adapter->wakeup_timer);
>  	del_timer_sync(&adapter->devdump_timer);
> +	cancel_work_sync(&adapter->devdump_work);
>  	mwifiex_cancel_all_pending_cmd(adapter);
>  	wake_up_interruptible(&adapter->cmd_wait_q.wait);
>  	wake_up_interruptible(&adapter->hs_activate_wait_q);

In this patch please only do the API change in mwifiex. The change to
using a workqueue needs to be in a separate patch so it can be properly
tested. I don't want a change like that going to the kernel without
testing on a real device.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches