Search Linux Wireless

Re: [BUG v6.2.7] Hitting BUG_ON() on rtw89 wireless driver startup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/22/23 19:59, Ping-Ke Shih wrote:


-----Original Message-----
From: Jonas Gorski <jonas.gorski@xxxxxxxxx>
Sent: Thursday, March 23, 2023 4:52 AM
To: Larry Finger <Larry.Finger@xxxxxxxxxxxx>
Cc: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; linux-wireless@xxxxxxxxxxxxxxx; Ping-Ke
Shih <pkshih@xxxxxxxxxxx>
Subject: Re: [BUG v6.2.7] Hitting BUG_ON() on rtw89 wireless driver startup

On Wed, 22 Mar 2023 at 18:03, Larry Finger <Larry.Finger@xxxxxxxxxxxx> wrote:

On 3/22/23 10:54, Hyeonggon Yoo wrote:

Hello folks,
I've just encountered weird bug when booting Linux v6.2.7

config: attached
dmesg: attached

I'm not sure exactly how to trigger this issue yet because it's not
stably reproducible. (just have encountered randomly when logging in)

At quick look it seems to be related to rtw89 wireless driver or network subsystem.

Your bug is weird indeed, and it does come from rtw89_8852be. My distro has not
yet released kernel 6.2.7, but I have not seen this problem with mainline
kernels throughout the 6.2 or 6.3 development series.

Looking at the rtw89 driver's probe function, the bug is probably a
simple race condition:

int rtw89_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
     ...
     ret = rtw89_core_register(rtwdev); <- calls ieee80211_register_hw();
     ...
     rtw89_core_napi_init(rtwdev);
     ...
}

so it registers the wifi device first, making it visible to userspace,
and then initializes napi.

So there is a window where a fast userspace may already try to
interact with the device before the driver got around to initializing
the napi parts, and then it explodes. At least that is my theory for
the issue.

Switching the order of these two functions should avoid it in theory,
as long as rtw89_core_napi_init() doesn't depend on anything
rtw89_core_register() does.

FWIW, registering the irq handler only after registering the device
also seems suspect, and should probably also happen before that.

Adding a 10 seconds sleep between rtw89_core_register() and
rtw89_core_napi_init(), and I can reproduce this issue:

int rtw89_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
     ...
     ret = rtw89_core_register(rtwdev);
     ...
	msleep(10 * 100);
	...
     rtw89_core_napi_init(rtwdev);
     ...
}

And, as your suggestion, I move the rtw89_core_register() to the last step
of PCI probe. Then, it looks more reasonable that we prepare NAPI and
interrupt handlers before registering netdev. Could you give it a try with
below fix?

diff --git a/pci.c b/pci.c
index fe6c0efc..087de2e0 100644
--- a/pci.c
+++ b/pci.c
@@ -3879,25 +3879,26 @@ int rtw89_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
         rtw89_pci_link_cfg(rtwdev);
         rtw89_pci_l1ss_cfg(rtwdev);

-       ret = rtw89_core_register(rtwdev);
-       if (ret) {
-               rtw89_err(rtwdev, "failed to register core\n");
-               goto err_clear_resource;
-       }
-
         rtw89_core_napi_init(rtwdev);

         ret = rtw89_pci_request_irq(rtwdev, pdev);
         if (ret) {
                 rtw89_err(rtwdev, "failed to request pci irq\n");
-               goto err_unregister;
+               goto err_deinit_napi;
+       }
+
+       ret = rtw89_core_register(rtwdev);
+       if (ret) {
+               rtw89_err(rtwdev, "failed to register core\n");
+               goto err_free_irq;
         }

         return 0;

-err_unregister:
+err_free_irq:
+       rtw89_pci_free_irq(rtwdev, pdev);
+err_deinit_napi:
         rtw89_core_napi_deinit(rtwdev);
-       rtw89_core_unregister(rtwdev);
  err_clear_resource:
         rtw89_pci_clear_resource(rtwdev, pdev);
  err_declaim_pci:

Ping-Ke,

The patch works fine here, but I did not have the problem.

When you submit it, add a Tested-by: Larry Finger<Larry.Finger@xxxxxxxxxxxx> and a Reviewed-by for the same address.

@Jonas: Good catch.

Larry





[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux