
Re: Linux v6.6 sporadic reboot failures with ath9k on i.MX6Q

On 11/29/2023 1:22 AM, Luca Ceresoli wrote:
Hello,

For several weeks I have been investigating a sporadic reboot failure on a
custom board based on i.MX6Q. There is an ATH9K Wi-Fi card connected
over PCIe, and the main suspect is the ath9k driver.

Is anybody aware of this kind of bug with ath9k?

Some details about my tests follow.

This is on mainline v6.6 Linux, with only the board dts and a defconfig
added. The board dts itself is based on imx6q.dtsi and among others it
adds:

&pcie {
         pinctrl-names = "default";
         pinctrl-0 = <&pinctrl_pcie>;
         reset-gpio = <&gpio2 20 GPIO_ACTIVE_LOW>;
         status = "okay";
};

and:

&iomuxc {
/* ... */
         imx6qdl-sabresd {
/* ... */
                 pinctrl_pcie: pciegrp {
                         fsl,pins = <
                                 MX6QDL_PAD_EIM_A18__GPIO2_IO20  0x1b0b0
                         >;
                 };
/* ... */
         };
};

Reboot usually works fine, but fails randomly in 1-5% of the
cases. The symptom is that the console stops producing any messages
at some random point in the shutdown sequence, even in the middle of a
line.

After about 7000 reboot attempts with different configurations it is
clear that enabling or disabling CONFIG_ATH9K is what makes the
difference:

  1. kernels with CONFIG_ATH9K=n never fail
  2. kernels with CONFIG_ATH9K=y do fail
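As a rough sanity check on what "never fail" can mean for a 1-5% failure rate, the sketch below estimates how many consecutive clean reboots are needed before a given rate can be ruled out. It assumes each reboot is an independent trial with a fixed failure probability, which is a simplification; the function name clean_runs_needed is hypothetical and not part of any tool mentioned in this thread.

```python
import math


def clean_runs_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Return the smallest n with (1 - failure_rate)**n <= 1 - confidence.

    Assumption: every reboot fails independently with probability
    `failure_rate`. After n clean reboots, a true rate that high would
    have produced at least one failure with probability >= `confidence`.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - p)^n <= 1 - c for n, rounding up to a whole reboot.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # The two ends of the failure-rate range reported above.
    for rate in (0.01, 0.05):
        runs = clean_runs_needed(rate)
        print(f"failure rate {rate:.0%}: {runs} clean reboots for 95% confidence")
```

Under these assumptions a few hundred clean reboots per configuration are enough to separate "never fails" from "fails 1% of the time", so a test campaign of about 7000 reboots split over a handful of configurations is large enough to support the comparison.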

Kernels built with CONFIG_ATH9K=y fail even with all optional
CONFIG_ATH9K* options disabled (rfkill, pcoem, btcoex and no_eeprom).

Similarly:

  1. removing pcie from the device tree makes reboot work
  2. with pcie left in the device tree and all the peripherals not
     required for booting removed, reboot still fails

On top of v6.6 I have applied all the potentially related commits from
master that appear as of now (8 in total):

   git log --oneline --reverse --format=%H v6.6..origin/master -- \
       drivers/net/wireless/ath/*.[ch] drivers/net/wireless/ath/ath9k/ \
     | xargs git cherry-pick

and reboot still fails.

I have tested these mainline kernel versions, with no result:
v6.1.60, v5.15.137, v5.10.199, v5.10.

A first look at the ath9k driver code did not show anything obviously
wrong.

Any clues about how to further investigate would be very welcome.

I am obviously available to provide more info.

Do you have a reboot log with "initcall_debug debug" set on the kernel command line? If so, does it always point to the PCI bus shutting down the device drivers, the PCIe ports and ultimately the root complex?
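When comparing many captured console logs, a small helper can report the last device whose shutdown callback was announced before the console went silent. The following is a minimal sketch, assuming that with initcall_debug the driver core logs one line per device of the form "<driver> <device>: shutdown", optionally preceded by a "[ seconds.micros]" timestamp; the exact format should be checked against a real log from the board. The name last_shutdown_device is hypothetical.

```python
import re
from typing import Optional

# Assumed line shape: optional "[  12.345678] " timestamp, then
# "<driver> <device>: shutdown" (or "shutdown_pre") at end of line.
SHUTDOWN_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?(?P<dev>\S.*?): shutdown(?:_pre)?\s*$"
)


def last_shutdown_device(log_text: str) -> Optional[str]:
    """Return the device named on the last complete shutdown line, if any.

    Lines that were cut off mid-way (the console can stop in the middle
    of a line) do not match and are ignored.
    """
    last = None
    for line in log_text.splitlines():
        match = SHUTDOWN_RE.match(line)
        if match:
            last = match.group("dev")
    return last
```

Running this over the text of each failed reboot log shows whether the hang always follows the same device. Because the output can stop mid-line, the reported device is only the last one that was fully logged, so the hang may be in that callback or in the one right after it.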

We have seen something similar before with ath10k_pci and our pcie-brcmstb driver, which eventually turned out to be the result of incorrect assumptions made while implementing the platform_driver::shutdown routine. There was a hard hang in ath10k_remove(). I do not recall the details, but we were definitely doing something improper there.

imx6_pcie_shutdown() seems to be much simpler, but my first guess would be there.

Hope this helps.
--
Florian



