Search Linux Wireless

RE: [BUG v6.2.7] Hitting BUG_ON() on rtw89 wireless driver startup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Jonas Gorski <jonas.gorski@xxxxxxxxx>
> Sent: Thursday, March 23, 2023 4:52 AM
> To: Larry Finger <Larry.Finger@xxxxxxxxxxxx>
> Cc: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; linux-wireless@xxxxxxxxxxxxxxx; Ping-Ke
> Shih <pkshih@xxxxxxxxxxx>
> Subject: Re: [BUG v6.2.7] Hitting BUG_ON() on rtw89 wireless driver startup
> 
> On Wed, 22 Mar 2023 at 18:03, Larry Finger <Larry.Finger@xxxxxxxxxxxx> wrote:
> >
> > On 3/22/23 10:54, Hyeonggon Yoo wrote:
> > >
> > > Hello folks,
> > > I've just encountered weird bug when booting Linux v6.2.7
> > >
> > > config: attached
> > > dmesg: attached
> > >
> > > I'm not sure exactly how to trigger this issue yet because it's not
> > > stably reproducible. (just have encountered randomly when logging in)
> > >
> > > At quick look it seems to be related to rtw89 wireless driver or network subsystem.
> >
> > Your bug is weird indeed, and it does come from rtw89_8852be. My distro has not
> > yet released kernel 6.2.7, but I have not seen this problem with mainline
> > kernels throughout the 6.2 or 6.3 development series.
> 
> Looking at the rtw89 driver's probe function, the bug is probably a
> simple race condition:
> 
> int rtw89_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> {
>     ...
>     ret = rtw89_core_register(rtwdev); <- calls ieee80211_register_hw();
>     ...
>     rtw89_core_napi_init(rtwdev);
>     ...
> }
> 
> so it registers the wifi device first, making it visible to userspace,
> and then initializes napi.
> 
> So there is a window where a fast userspace may already try to
> interact with the device before the driver got around to initializing
> the napi parts, and then it explodes. At least that is my theory for
> the issue.
> 
> Switching the order of these two functions should avoid it in theory,
> as long as rtw89_core_napi_init() doesn't depend on anything
> rtw89_core_register() does.
> 
> FWIW, registering the irq handler only after registering the device
> also seems suspect, and should probably also happen before that.

Adding a 10 seconds sleep between rtw89_core_register() and
rtw89_core_napi_init(), and I can reproduce this issue:

int rtw89_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
    ...
    ret = rtw89_core_register(rtwdev);
    ...
	msleep(10 * 100);
	...
    rtw89_core_napi_init(rtwdev);
    ...
}

And, as your suggestion, I move the rtw89_core_register() to the last step
of PCI probe. Then, it looks more reasonable that we prepare NAPI and
interrupt handlers before registering netdev. Could you give it a try with
below fix?

diff --git a/pci.c b/pci.c
index fe6c0efc..087de2e0 100644
--- a/pci.c
+++ b/pci.c
@@ -3879,25 +3879,26 @@ int rtw89_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
        rtw89_pci_link_cfg(rtwdev);
        rtw89_pci_l1ss_cfg(rtwdev);

-       ret = rtw89_core_register(rtwdev);
-       if (ret) {
-               rtw89_err(rtwdev, "failed to register core\n");
-               goto err_clear_resource;
-       }
-
        rtw89_core_napi_init(rtwdev);

        ret = rtw89_pci_request_irq(rtwdev, pdev);
        if (ret) {
                rtw89_err(rtwdev, "failed to request pci irq\n");
-               goto err_unregister;
+               goto err_deinit_napi;
+       }
+
+       ret = rtw89_core_register(rtwdev);
+       if (ret) {
+               rtw89_err(rtwdev, "failed to register core\n");
+               goto err_free_irq;
        }

        return 0;

-err_unregister:
+err_free_irq:
+       rtw89_pci_free_irq(rtwdev, pdev);
+err_deinit_napi:
        rtw89_core_napi_deinit(rtwdev);
-       rtw89_core_unregister(rtwdev);
 err_clear_resource:
        rtw89_pci_clear_resource(rtwdev, pdev);
 err_declaim_pci:

--
Ping-Ke






[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux