[cc +Pali] On Thu, 15 Apr 2021 20:02:23 +0200 Ingmar Klein <ingmar_klein@xxxxxx> wrote: > First thanks to you both, Alex and Bjorn! > I am in no way an expert on this topic, so I have to fully rely on your > feedback, concerning this issue. > > If you should have any other solution approach, in form of patch-set, I > would be glad to test it out. Just let me know, what you think might > make sense. > I will wait for your further feedback on the issue. In the meantime I > have my current workaround via quirk entry. > > By the way, my layman's question: > Do you think, that the following topic might also apply for the QCA6174? > https://www.spinics.net/lists/linux-pci/msg106395.html > Or in other words, should a similar approach be tried for the QCA6174 > and if yes, would it bring any benefit at all? > I hope you can excuse me, in case the questions should not make too much > sense. If you run lspci -vvv on your device, what do LnkCap and LnkSta report under the express capability? I wonder if your device even supports >Gen1 speeds, mine does not. I would not expect that patch to be relevant to you based on your report. I understand it to resolve an issue during link retraining to a higher speed on boot, not during a bus reset. Pali can correct if I'm wrong. Thanks, Alex > Am 15.04.2021 um 04:36 schrieb Alex Williamson: > > On Wed, 14 Apr 2021 16:03:50 -0500 > > Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > > >> [+cc Alex] > >> > >> On Fri, Apr 09, 2021 at 11:26:33AM +0200, Ingmar Klein wrote: > >>> Edit: Retry, as I did not consider, that my mail-client would make this > >>> party html. > >>> > >>> Dear maintainers, > >>> I recently encountered an issue on my Proxmox server system, that > >>> includes a Qualcomm QCA6174 m.2 PCIe wifi module. > >>> https://deviwiki.com/wiki/AIRETOS_AFX-QCA6174-NX > >>> > >>> On system boot and subsequent virtual machine start (with passed-through > >>> QCA6174), the VM would just freeze/hang, at the point where the ath10k > >>> driver loads. > >>> Quick search in the proxmox related topics, brought me to the following > >>> discussion, which suggested a PCI quirk entry for the QCA6174 in the kernel: > >>> https://forum.proxmox.com/threads/pcie-passthrough-freezes-proxmox.27513/ > >>> > >>> I then went ahead, got the Proxmox kernel source (v5.4.106) and applied > >>> the attached patch. > >>> Effect was as hoped, that the VM hangs are now gone. System boots and > >>> runs as intended. > >>> > >>> Judging by the existing quirk entries for Atheros, I would think, that > >>> my proposed "fix" could be included in the vanilla kernel. > >>> As far as I saw, there is no entry yet, even in the latest kernel sources. > >> This would need a signed-off-by; see > >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=v5.11#n361 > >> > >> This is an old issue, and likely we'll end up just applying this as > >> yet another quirk. But looking at c3e59ee4e766 ("PCI: Mark Atheros > >> AR93xx to avoid bus reset"), where it started, it seems to be > >> connected to 425c1b223dac ("PCI: Add Virtual Channel to save/restore > >> support"). > >> > >> I'd like to dig into that a bit more to see if there are any clues. > >> AFAIK Linux itself still doesn't use VC at all, and 425c1b223dac added > >> a fair bit of code. I wonder if we're restoring something out of > >> order or making some simple mistake in the way to restore VC config. > > I don't really have any faith in that bisect report in commit > > c3e59ee4e766. To double check I dug out the card from that commit, > > installed an old Fedora release so I could build kernel v3.13, > > pre-dating 425c1b223dac and tested triggering a bus reset both via > > setpci and by masking PM reset so that sysfs can trigger the bus reset > > path with the kernel save/restore code. Both result in the system > > hanging when the device is accessed either restoring from the kernel > > bus reset or reading from the device after the setpci reset. Thanks, > > > > Alex > > >