Hi Greg,

On 5/5/23 22:52, Greg KH wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>
> On Fri, May 05, 2023 at 11:42:51PM +0000, Ajay.Kathat@xxxxxxxxxxxxx wrote:
>> Fix for kernel crash observed with following test procedure:
>> while true;
>> do ifconfig wlan0 up;
>> iw dev wlan0 scan &
>> ifconfig wlan0 down;
>> done
>>
>> During the above test procedure, the scan results are received from firmware
>> for 'iw scan' command gets queued even when the interface is going down. It
>> was causing the kernel oops when dereferencing the freed pointers.
>>
>> For synchronization, 'mac_close()' calls flush_workqueue() to block its
>> execution till all pending work is completed. Afterwards 'wilc->close' flag
>> which is set before the flush_workqueue() should avoid adding new work.
>> Added 'wilc->close' check in wilc_handle_isr() which is common for
>> SPI/SDIO bus to ignore the interrupts from firmware that inturns adds the
>> work since the interface is getting closed.
>>
>> Also, removed isr_uh_routine() as it's not necessary after 'wl->close' check
>> is added in wilc_handle_isr(). So now the default primary handler would be
>> used for threaded IRQ.
>>
>> Cc: stable@xxxxxxxxxxxxxxx
>> Reported-by: Michael Walle <mwalle@xxxxxxxxxx>
>> Link: https://lore.kernel.org/linux-wireless/20221024135407.7udo3dwl3mqyv2yj@xxxxxxxxxxxx/
>> Signed-off-by: Ajay Singh <ajay.kathat@xxxxxxxxxxxxx>
>> ---
>> changes since v1:
>> - updated commit description and included 'Link:' tag
>> - use atomic_t type for 'close' variable
>> - set close state after clearing ongoing scan operation
>> - make use of default primary handler for threaded_irq
>> - avoid false failure debug message during mac_close
>>
>>  .../wireless/microchip/wilc1000/cfg80211.c    |  2 +-
>>  drivers/net/wireless/microchip/wilc1000/hif.c |  2 +-
>>  .../net/wireless/microchip/wilc1000/netdev.c  | 33 ++++++-------------
>>  .../net/wireless/microchip/wilc1000/netdev.h  |  2 +-
>>  .../net/wireless/microchip/wilc1000/wlan.c    |  3 ++
>>  5 files changed, 16 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/net/wireless/microchip/wilc1000/cfg80211.c b/drivers/net/wireless/microchip/wilc1000/cfg80211.c
>> index b545d93c6e37..a90a75094486 100644
>> --- a/drivers/net/wireless/microchip/wilc1000/cfg80211.c
>> +++ b/drivers/net/wireless/microchip/wilc1000/cfg80211.c
>> @@ -461,7 +461,7 @@ static int disconnect(struct wiphy *wiphy, struct net_device *dev,
>>         if (!wilc)
>>                 return -EIO;
>>
>> -       if (wilc->close) {
>> +       if (atomic_read(&wilc->close)) {
>
> What happens if this changes right after you read from this?
>

Yeah, there is a possible race condition between 'cfg80211_ops->disconnect()'
and 'ndo_stop()'. The above check is not enough to handle that scenario. For
that, I thought of using a lock between 'wifi disconnect' and 'interface
close' to serialize the access, and of submitting it as a separate patch
(maybe as part of a patch series). However, this patch does resolve the issue
that was reported with the test procedure.
By setting the 'close' state correctly (i.e., before flushing the workqueue),
no new work gets added for processing while the interface is going down.

> Don't reimplement locks on your own, use a real one please.

Sure, I will work on updating the patch using a lock between 'disconnect()' &
'ndo_stop()'.

Regards,
Ajay
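
P.S. To make the locking idea concrete, here is a rough user-space sketch of
the serialization between 'disconnect()' and 'ndo_stop()'. It is not the
driver code: 'struct demo_wilc', 'demo_disconnect()', 'demo_close()' and the
-EBUSY return value are hypothetical names chosen for illustration, and a
pthread mutex stands in for whatever kernel lock the updated patch ends up
using. The point is only that the 'close' check and the action that depends
on it happen under the same lock, so the state cannot change in between.

```c
/*
 * User-space sketch only. All names are hypothetical; a pthread mutex
 * stands in for a kernel lock such as struct mutex.
 */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_wilc {
	pthread_mutex_t lock;	/* serializes disconnect vs. close */
	bool closing;		/* only read or written with lock held */
	int disconnects_sent;	/* stand-in for requests handed to firmware */
};

#define DEMO_WILC_INIT { .lock = PTHREAD_MUTEX_INITIALIZER }

/* Models cfg80211_ops->disconnect(): check and act under one lock. */
static int demo_disconnect(struct demo_wilc *w)
{
	int ret = 0;

	assert(w != NULL);

	pthread_mutex_lock(&w->lock);
	if (w->closing)
		ret = -EBUSY;		/* placeholder error code */
	else
		w->disconnects_sent++;	/* cannot race with demo_close() */
	pthread_mutex_unlock(&w->lock);

	return ret;
}

/* Models ndo_stop(): publish the closing state under the same lock. */
static void demo_close(struct demo_wilc *w)
{
	assert(w != NULL);

	pthread_mutex_lock(&w->lock);
	w->closing = true;
	pthread_mutex_unlock(&w->lock);
	/* Pending work would be flushed here, after the state is set. */
}
```

With this shape, a disconnect that wins the lock completes before the close
state is set, and one that loses it sees 'closing' and backs off. A bare
atomic_read() cannot give that guarantee because nothing holds the state
steady after the read.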