Re: [PATCH v9 6/9] PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller

On Fri, 2024-10-18 at 17:47 +0300, Ilpo Järvinen wrote:
> This mostly reverts the commit b4c7d2076b4e ("PCI/LINK: Remove
> bandwidth notification"). An upcoming commit extends this driver
> building PCIe bandwidth controller on top of it.
> 
> The PCIe bandwidth notification were first added in the commit
> e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth
> notification") but later had to be removed. The significant changes
> compared with the old bandwidth notification driver include:
> 
> 1) Don't print the notifications into kernel log, just keep the Link
>    Speed cached in struct pci_bus updated. While somewhat unfortunate,
>    the log spam was the source of complaints that eventually lead to
>    the removal of the bandwidth notifications driver (see the links
>    below for further information).
> 
> 2) Besides the Link Bandwidth Management Interrupt, enable also Link
>    Autonomous Bandwidth Interrupt to cover the other source of
>    bandwidth changes.
> 
> 3) Use threaded IRQ with IRQF_ONESHOT to handle Bandwidth Notification
>    Interrupts to address the problem fixed in the commit 3e82a7f9031f
>    ("PCI/LINK: Supply IRQ handler so level-triggered IRQs are acked")).
> 
> 4) Handle Link Speed updates robustly. Refresh the cached Link Speed
>    when enabling Bandwidth Notification Interrupts, and solve the race
>    between Link Speed read and LBMS/LABS update in
>    pcie_bwnotif_irq_thread().
> 
> 5) Use concurrency safe LNKCTL RMW operations.
> 
> 6) The driver is now called PCIe bwctrl (bandwidth controller) instead
>    of just bandwidth notifications because of increased scope and
>    functionality within the driver.
> 
> 7) Coexist with the Target Link Speed quirk in
>    pcie_failed_link_retrain(). Provide LBMS counting API for it.
> 
> 8) Tweaks to variable/functions names for consistency and length
>    reasons.
> 
> Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus
> to keep track PCIe Link Speed changes.
> 
> Link: https://lore.kernel.org/all/20190429185611.121751-1-helgaas@xxxxxxxxxx/
> Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@xxxxxxxxx/
> Link: https://lore.kernel.org/linux-pci/20200115221008.GA191037@xxxxxxxxxx/
> Suggested-by: Lukas Wunner <lukas@xxxxxxxxx> # Building bwctrl on top of bwnotif
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> ---

Hi Ilpo,

I bisected a v6.13-rc1 boot hang on my personal workstation to this
patch. Sadly I don't have many details, such as a panic message, because
the boot hangs before any kernel messages appear, or at least they're not
visible long enough to read. I haven't looked into the code yet, as I
wanted to raise awareness first. Since the commit doesn't revert cleanly
on v6.13-rc1, I also haven't tried reverting it yet.

Here are some details on my system:
- AMD Ryzen 9 3900X 
- ASRock X570 Creator Motherboard
- Radeon RX 5600 XT
- Intel JHL7540 Thunderbolt 3 USB Controller (only USB 2 plugged)
- Intel 82599 10 Gigabit NIC with SR-IOV enabled with 2 VFs
- Intel I211 Gigabit NIC
- Intel Wi-Fi 6 AX200
- Aquantia AQtion AQC107 NIC

If you have patches or other things for me to try, just ask.

Thanks,
Niklas
