On Fri, 24 Aug 2018 11:46:39 +0300 Claudiu Beznea <Claudiu.Beznea@xxxxxxxxxxxxx> wrote: > On 23.08.2018 17:36, Ajay Singh wrote: > > On Thu, 23 Aug 2018 11:11:18 +0300 > > Claudiu Beznea <Claudiu.Beznea@xxxxxxxxxxxxx> wrote: > > > >> On 14.08.2018 09:50, Ajay Singh wrote: > >>> Remove the use of static variable 'terminated_handle' and instead > >>> move in wilc_vif struct. > >>> After moving this variable to wilc_vif struct its not required to > >>> keep 'terminated_handle', so changed it to boolean type. > >> > >> You can remove it at all and use wilc->hif_deinit_lock mutex also > >> in wilc_scan_complete_received() and wilc_network_info_received() > >> it is used in wilc_gnrl_async_info_received(). > > > > In my understanding, 'terminated_handle' is used to know the > > status when interface is getting deinit(). During deinitialization > > of an interface if any async event received for the interface(from > > firmware) should be ignored. > > 'terminated_handle' true only inside mutex. So, outside of it it will > be false, so *mostly* it will tell you when mutex is locked for > deinit. *Mostly*, because context switches may happen while a mutex > is locked. > Yes, outside the mutex 'terminated_handle' would be false. I already agreed that while fixing the bug using 'terminated_handle' all the scenarios are not handled but as you suggested before to remove 'terminated_handle' and only use 'mutex' will also not help to address all corner scenarios. So, I suggest to keep the current status by making use of this flag and handle all scenario in later patch to de-initialization graceful. > With the current approach you have this code: > > int wilc_deinit(struct wilc_vif *vif) > { > // ... > mutex_lock(&vif->wilc->hif_deinit_lock); > > // (A) > > vif->is_termination_progress = true; > // ... > vif->is_termination_progress = false; > > mutex_unlokc(&vif->wilc->hif_deinit_lock); > } > > And: > > void wilc_network_info_received(struct wilc *wilc, u8 *buffer, u32 > length) { > // ... > if (!hif_drv || vif->is_termination_progress) { > netdev_err(vif->ndev, "driver not init[%p]\n", > hif_drv); return; > } > > // ... > > // (B) > result = wilc_enqueue_work(msg); > // ... > } > > And: > > static int wilc_enqueue_work(struct host_if_msg *msg) > > { > > INIT_WORK(&msg->work, msg->fn); > > > > if (!msg->vif || !msg->vif->wilc > || !msg->vif->wilc->hif_workqueue) > > return -EINVAL; > > > // (C) > if (!queue_work(msg->vif->wilc->hif_workqueue, &msg->work)) > > return -EINVAL; > > > > return 0; > > } > > > You may have the following scenario: > 1. context switch in wilc_deinit() just after the mutex is taken > (point A above). vif->is_termination_progress = false at this point. > > 2. a new packet is received and wilc_network_info_received() gets > executed and execution reaches point B, goes to wilc_enqueue_work() > and suspend at point C then context switch. > Thanks for explaining the scenario with an example. For the given scenario, I think not only here it can even suspend after posting the work queue and start the execution of handler function eg. handle_recvd_gnrl_async_info()(there is no protection in handle function) There are some combination of these scenario, I will relook into these cases and check how to handle them separately. > 3. wilc_deinit() resumes and finish its execution. > > 4. wilc_enqueue_work() resumes and queue_work() is executed but you > already freed the hif_workqueue. You will have a crash here. > > Notice that hif_drv is not set to NULL on wilc_deinit() after it is > kfree(). > > > > > In my opinion its not right to only wait for the mutex in any of > > callback e.g wilc_scan_complete_received() because it will delay the > > handling of callback and try to process the event once lock is > > available for the interface which is already de-initialized. > > But this is already done for wilc_gnrl_async_info_received(). > > > > > Based on my understand 'mutex' alone is not enough to > > handle this and we some extra check to know if interface is down. > > terminated_handle will *mostly* tell you when the mutex is locked, > nothing more. > > I will > > check more about this patch how to handle the extra scenario for > > this case. > > Please suggest if someone has better idea on how to handle this. > >