On 22/10/26 10:54, Michael Walle wrote: > Hi Ajay, > > On 22/10/25 08:26, Ajay.Kathat@xxxxxxxxxxxxx wrote: > > > In handle_rcvd_ntwrk_info() scan_req->scan_result isn't valid anymore, > > > although it doesn't contain NULL. Thus the driver is calling into a > > > bogus function pointer. There seems to be no locking between the > > > asynchronous calls within the workqueue (wilc_enqueue_work()) and when > > > the interface is disabled (wilc_deinit()). wilc_deinit() will free the > > > host_if_drv object which might still be used within the workqueue > > > context. > > > > > > Please try the below code changes with your test setup environment. > > The workqueue handling and wilc_deinit() run in parallel, correct? .. > > > --- a/drivers/net/wireless/microchip/wilc1000/hif.c > > +++ b/drivers/net/wireless/microchip/wilc1000/hif.c > > @@ -495,12 +495,18 @@ static void handle_rcvd_ntwrk_info(struct > > work_struct *work) > > { > > struct host_if_msg *msg = container_of(work, struct > > host_if_msg, work); > > struct wilc_rcvd_net_info *rcvd_info = &msg->body.net_info; > > - struct wilc_user_scan_req *scan_req = > > &msg->vif->hif_drv->usr_scan_req; > > + struct host_if_drv *hif_drv = msg->vif->hif_drv; > > + struct wilc_user_scan_req *scan_req; > > const u8 *ch_elm; > > u8 *ies; > > int ies_len; > > size_t offset; > > > > + if (!hif_drv || !hif_drv->usr_scan_req.scan_result) > > + goto done; > > .. So what if hif_drv will be set to NULL right after this check? > > I don't think you'll get around using proper locking here. > > > + > > + scan_req = &hif_drv->usr_scan_req; > > + > > if (ieee80211_is_probe_resp(rcvd_info->mgmt->frame_control)) > > offset = offsetof(struct ieee80211_mgmt, > > u.probe_resp.variable); > > else if (ieee80211_is_beacon(rcvd_info->mgmt->frame_control)) > > @@ -1574,6 +1580,9 @@ void wilc_network_info_received(struct wilc *wilc, > > u8 *buffer, u32 length) > > return; > > } > > > > + if (!hif_drv->usr_scan_req.scan_result) > > + return; > > + > > This is also racy. What if scan_result is cleared right after this > check? Then the work item will still get added to the work queue. > > Here is the call tree: > > isr_bh_routine() [interrupt thread ctx] > wilc_handle_isr() > mutex_lock(hif_cs) > wilc_wlan_handle_isr_ext() > wilc_wlan_handle_rxq() > wilc_wlan_handle_rx_buff() > wilc_wlan_cfg_indicate_rx() > wilc_network_info_received() > wilc_enqueue_work(handle_rcvd_ntwrk_info) > mutex_unlock(hif_cs) > > handle_rcvd_ntwrk_info [hif_workqueue ctx] > if (scan_result) > scan_result() > > wilc_mac_close() [ioctl ctx?] > wilc_deinit_host_int() > wilc_deinit() > mutex_lock(deinit_lock) > if (scan_result) > scan_result() > scan_result = NULL > kfree(hif_drv) > hif_drv = NULL > mutex_unlock(deinit_lock) > > I don't see any synchronization mechanisms, between these threads. > > > msg = wilc_alloc_work(vif, handle_rcvd_ntwrk_info, false); > > if (IS_ERR(msg)) > > return; > > > > The above changes should avoid the kernel crash exception. > > As mentioned above, I think this will just decrease the chance that > it is happening. Nonetheless, I've tried your changes but it doesn't > fix the crash. Any news here? Thanks, -michael > > > BTW, ignore the "FW not repsonding" for now, that seems to be a > > > different problem. > > > > "FW not responding" log indicates the chip sleep command failure from > > Host to the FW. It's a temporary failure log for specific command. > > During the de-init process, this logs is often observed. IIRC, there was > > a change in the latest driver that reduced its frequency but I am unable > > to recall the exact change. > > So what is a "temporary failure"? It is a pr_warn() and customers > (rightfully) complains about them. Apart from that, it only happens on > the second "ifconfig wlan0 up". There are no messages during the first > one. So I suspect there is still something fishy. > > -michael