On Mon, Dec 03, 2018 at 02:09:24PM +0100, Niklas Cassel wrote:
> On Mon, Dec 03, 2018 at 10:42:19AM +0000, Lorenzo Pieralisi wrote:
> > On Sun, Dec 02, 2018 at 12:50:58AM +0100, Niklas Cassel wrote:
> > > On Wed, Nov 21, 2018 at 07:24:53PM +0200, Stanimir Varbanov wrote:
> > > > Hi,
> > > >
> > > > On 11/14/18 12:57 AM, Marc Zyngier wrote:
> > > > > It recently came to light that the Designware PCIe driver is rather
> > > > > broken in the way it handles MSI[1]:
> > > > >
> > > > > - It masks interrupt by disabling them, meaning that MSIs generated
> > > > >   during the masked window are simply lost. Oops.
> > > > >
> > > > > - Acking of the currently pending MSI is done outside of the interrupt
> > > > >   flow, getting moved around randomly and ultimately breaking the
> > > > >   driver. Not great.
> > > > >
> > > > > This series attempts to address this by switching to using the MASK
> > > > > register for masking interrupts (!), and move the ack into the
> > > > > appropriate callback, giving it a fixed place in the MSI handling
> > > > > flow.
> > > > >
> > > > > Note that this is only compile-tested on my arm64 laptop, as I'm
> > > > > travelling and do not have the required HW to test it anyway. I'd
> > > > > welcome both review and testing by the interested parties (dwc
> > > > > maintainer and users affected by existing bugs).
> > > > >
> > > > > Thanks,
> > > > >
> > > > > 	M.
> > > > >
> > > > > [1] https://patchwork.kernel.org/patch/10657987/
> > > > >
> > > > > Marc Zyngier (3):
> > > > >   PCI: designware: Use interrupt masking instead of disabling
> > > > >   PCI: designware: Take lock when ACKing an interrupt
> > > > >   PCI: designware: Move interrupt acking into the proper callback
> > > > >
> > > > >  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
> > > > >  1 file changed, 14 insertions(+), 8 deletions(-)
> > > > >
> > > >
> > > > for pcie-qcom:
> > > >
> > > > Tested-by: Stanimir Varbanov <svarbanov@xxxxxxxxxx>
> > >
> > > Hello PCI folks,
> > >
> > > Since this is a real bug, we should try get a couple of Tested-by tags
> > > before it's too late.
> > > It would be nice if v4.20 was released without broken MSIs in this driver.
> > >
> > > Personally I get confused just by looking at this mail thread.
> > >
> > > I see 3 patches from Marc and a fix-up from Marc, but I also see
> > > a patch from Gustavo, and another patch from Trent.
> > >
> > > It seems like Lorenzo has a branch with Marc's 3 patches + Marc's fix-up
> > > folded in here:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi
> > >
> > > Perhaps it would be a good idea to send a V2, with proper Fixes-tags,
> > > just so that people would know what to test, so that we can start getting
> > > those Tested-by tags.
> >
> > Perhaps it would be a good idea to pull the branch above and test it
> > after I have sent three reminders to all DWC host bridge maintainers through
> > this email thread.
> >
> > I have no problem reposting those patches but it is time you started
> > testing them, I have already explained what's the issue they are fixing
> > in this thread, I do not think a Fixes: tag will add any further degree
> > of urgency.
> >
>
> I tested Lorenzo's
> https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi
> branch with drivers/pci/controller/dwc/pcie-qcom.c.
>
> Without this branch, when having an ath10k PCIe endpoint connected,
> and simply running the ath10k as a host access point (running hostapd):
> watch cat /proc/interrupts | grep ath10k_pci
> I consistently stop getting interrupts in less than a minute.
>
> With this branch, I've been able to run the same test case successfully
> for 30+ minutes.
>
> Tested-by: Niklas Cassel <niklas.cassel@xxxxxxxxxx>

Thank you very much. I need this tag on linux-pci@xxxxxxxxxxxxxxx in reply
to this series:

https://patchwork.ozlabs.org/project/linux-pci/list/?series=75827

I need more testing done, so I encourage other DWC maintainers to test my
branch above.

I am still mulling it over, but I may consider this v4.21 material if I do
not get enough testing done this week.

Thanks,
Lorenzo
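
For readers following the thread, the first problem in the cover letter (MSIs lost while a vector is "masked" by disabling it) can be illustrated with a toy model. This is only a sketch under stated assumptions: the struct, its fields, and every function name below are hypothetical and do not match the real DWC register layout or the driver code. It also assumes, as the cover letter describes, that a disabled vector does not latch an incoming MSI while a masked vector does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one bank of MSI vectors (not the real DWC layout). */
struct msi_bank {
	uint32_t enable;	/* bit set: vector is enabled */
	uint32_t mask;		/* bit set: vector is masked */
	uint32_t status;	/* bit set: vector is pending (latched) */
};

/*
 * An endpoint signals vector 'vec'.
 * Assumption: only enabled vectors are latched into the status bits.
 */
void msi_arrive(struct msi_bank *b, unsigned int vec)
{
	if (b->enable & (1u << vec))
		b->status |= 1u << vec;
}

/* Vectors that would currently be signalled to the CPU. */
uint32_t msi_deliverable(const struct msi_bank *b)
{
	return b->status & b->enable & ~b->mask;
}

/* Old scheme: "mask" by clearing the enable bit. Returns true if the MSI is lost. */
bool lost_with_disable(void)
{
	struct msi_bank b = { .enable = 1u, .mask = 0, .status = 0 };

	b.enable &= ~1u;	/* "mask" vector 0 by disabling it */
	msi_arrive(&b, 0);	/* arrives during the window: not latched */
	b.enable |= 1u;		/* "unmask" */

	return msi_deliverable(&b) == 0;
}

/* New scheme: use the mask bit. Returns true if the MSI is lost. */
bool lost_with_mask(void)
{
	struct msi_bank b = { .enable = 1u, .mask = 0, .status = 0 };

	b.mask |= 1u;		/* mask vector 0 */
	msi_arrive(&b, 0);	/* latched in status, just not signalled yet */
	b.mask &= ~1u;		/* unmask: the pending bit becomes deliverable */

	return msi_deliverable(&b) == 0;
}
```

In this model, lost_with_disable() reports the interrupt as lost and lost_with_mask() does not, which is consistent with the stalled ath10k interrupts reported above without the branch.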