Re: MT7921 Causing Kernel to Freeze after Reboot

From: Sean Wang <sean.wang@xxxxxxxxxxxx>

>On Thu, 2022-03-24 at 10:13 +0100, Íñigo Huguet wrote:
>> On Wed, Dec 22, 2021 at 12:52 PM Philippe Schenker <dev@xxxxxxxxxxxx>
>> wrote:
>> >
>> > Hello
>> >
>> > So I received a new notebook recently, this is a Lenovo P14s that
>> > has a Mediatek 7961 network controller inside.
>> >
>> > -----
>> >
>> > 03:00.0 Network controller: MEDIATEK Corp. Device 7961
>> >         Subsystem: Lenovo Device e0bc
>> >         Physical Slot: 0
>> >         Flags: bus master, fast devsel, latency 0, IRQ 91, IOMMU group 13
>> >         Memory at 870200000 (64-bit, prefetchable) [size=1M]
>> >         Memory at 870300000 (64-bit, prefetchable) [size=16K]
>> >         Memory at 870304000 (64-bit, prefetchable) [size=4K]
>> >         Capabilities: <access denied>
>> >         Kernel driver in use: mt7921e
>> >         Kernel modules: mt7921e
>> > ------
>> >
>> > The issue I have is that on the 5.16-rc6 kernel (and on other rcs)
>> > the system always freezes after I issue a "reboot" command. A
>> > "poweroff" followed by a normal power-on always works.
>>
>> I have a bug report with this same behaviour and almost identical
>> kernel logs: the message "Timeout for driver own" followed by traces
>> related to mt7921 DMA code, indicating a bad page state with refcount
>> -1 and "page dumped because: nonzero _refcount", finally causing a
>> crash during boot. This happens only after a reboot, not after a
>> normal power-on.
>>
>> It happens always, even with v5.17. Commit 602cc0c9618a (mt76:
>> mt7921e: fix possible probe failure after reboot) doesn't fix the
>> issue.
>>
>> I haven't been able to verify exactly where the problem is, but my
>> guess is this:
>> - In function mt7921_init_hardware, initialization fails because
>> mt7921e_driver_own doesn't finish before the timeout (thus we see the
>> "Timeout for driver own")
>> - Then, before retrying the init, mt7921_init_hardware calls
>> mt7921e_init_reset, and the latter calls mt7921_wpdma_reset
>> - That cleans up the DMA queues before stopping the DMA, which had
>> been enabled shortly before, during probe
>> - Then, my guess is that in the meantime a DMA event arrives while
>> the queues are still being cleaned up
>>
>> Does it make sense?
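
The ordering hypothesis in the list above can be illustrated with a small toy model. This is only a sketch of the suspected race, not the mt76 code: `struct toy_ring` and every `toy_*` function are hypothetical names invented for illustration, and the "late DMA event" is simulated deterministically rather than coming from real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4

/* Toy stand-in for a DMA ring; NOT the real mt76 data structures. */
struct toy_ring {
    bool dma_enabled;          /* device may write into buffers while true */
    bool buf_valid[RING_SIZE]; /* true while the slot still has a live buffer */
    int bad_writes;            /* device writes that hit an already-freed buffer */
};

static void toy_ring_init(struct toy_ring *r)
{
    r->dma_enabled = true;     /* DMA was enabled earlier, e.g. during probe */
    r->bad_writes = 0;
    for (size_t i = 0; i < RING_SIZE; i++)
        r->buf_valid[i] = true;
}

/* Simulated device completion that lands in slot idx. */
static void toy_dma_event(struct toy_ring *r, size_t idx)
{
    if (!r->dma_enabled)
        return;                /* engine stopped: nothing is written */
    if (!r->buf_valid[idx])
        r->bad_writes++;       /* write into memory that was already freed */
}

/* Simulated queue cleanup: every buffer is released. */
static void toy_cleanup(struct toy_ring *r)
{
    for (size_t i = 0; i < RING_SIZE; i++)
        r->buf_valid[i] = false;
}

/* Suspected ordering: free the buffers first, stop DMA afterwards. */
static int toy_reset_cleanup_first(void)
{
    struct toy_ring r;

    toy_ring_init(&r);
    toy_cleanup(&r);
    toy_dma_event(&r, 0);      /* late event arrives inside the window */
    r.dma_enabled = false;
    return r.bad_writes;
}

/* Alternative ordering: stop DMA first, then free the buffers. */
static int toy_reset_stop_first(void)
{
    struct toy_ring r;

    toy_ring_init(&r);
    r.dma_enabled = false;
    toy_dma_event(&r, 0);      /* same late event, now ignored */
    toy_cleanup(&r);
    return r.bad_writes;
}
```

In this model `toy_reset_cleanup_first()` records one write into a freed buffer, while `toy_reset_stop_first()` records none. Whether the real driver has the same window is exactly the open question in this thread.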
>
>Following your suggestion I went down the rabbit hole and bisected this
>issue. Fortunately, I found the commit that introduced it. Reverting
>this commit solves the problem for me on v5.17. The issue appears to be
>related to the PCIe ASPM feature.
>
># first bad commit: [bf3747ae2e25dda6a9e6c464a717c66118c588c8] mt76: mt7921: enable aspm by default
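
When comparing test runs around an ASPM-related commit, it can help to record which global ASPM policy was active. On kernels built with PCIe ASPM support, `/sys/module/pcie_aspm/parameters/policy` lists the available policies with the active one in square brackets. The helper below is a minimal sketch that extracts that bracketed entry from a line the caller has already read. The function name `aspm_active_policy` is hypothetical, and the helper says nothing about the per-device ASPM state that the bisected commit changes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (active) entry of an ASPM policy line, for example
 * "default performance [powersave] powersupersave", into out.
 * Returns 0 on success, -1 if there is no bracketed entry or it does
 * not fit into out (including the terminating NUL).
 */
static int aspm_active_policy(const char *line, char *out, size_t out_len)
{
    const char *open = strchr(line, '[');
    const char *close = open ? strchr(open, ']') : NULL;
    size_t len;

    if (!open || !close)
        return -1;

    len = (size_t)(close - open - 1);  /* characters between '[' and ']' */
    if (len == 0 || len >= out_len)
        return -1;

    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}
```

A caller would read one line from the sysfs file (for instance with `fgets`) and pass it in; the helper itself does no I/O, so it can be checked against sample strings.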

Have you tried the latest firmware to see whether it helps with the issue?

For example: https://patchwork.kernel.org/project/linux-mediatek/patch/8e8a3e94ffe7586cec5abe56ba507e1e3ed8b823.1648171096.git.objelf@xxxxxxxxx/

>
>@Felix, do I need to report this anywhere other than here?
>
>Thanks,
>Philippe
>
>>
>> >
>> > Since it freezes and shows multiple call traces, I included 4 logs
>> > in the attachment; they always point to mt76_dma functions.

<snip>
