On Mon, Dec 03, 2018 at 10:42:19AM +0000, Lorenzo Pieralisi wrote:
> On Sun, Dec 02, 2018 at 12:50:58AM +0100, Niklas Cassel wrote:
> > On Wed, Nov 21, 2018 at 07:24:53PM +0200, Stanimir Varbanov wrote:
> > > Hi,
> > >
> > > On 11/14/18 12:57 AM, Marc Zyngier wrote:
> > > > It recently came to light that the Designware PCIe driver is rather
> > > > broken in the way it handles MSI[1]:
> > > >
> > > > - It masks interrupt by disabling them, meaning that MSIs generated
> > > >   during the masked window are simply lost. Oops.
> > > >
> > > > - Acking of the currently pending MSI is done outside of the interrupt
> > > >   flow, getting moved around randomly and ultimately breaking the
> > > >   driver. Not great.
> > > >
> > > > This series attempts to address this by switching to using the MASK
> > > > register for masking interrupts (!), and move the ack into the
> > > > appropriate callback, giving it a fixed place in the MSI handling
> > > > flow.
> > > >
> > > > Note that this is only compile-tested on my arm64 laptop, as I'm
> > > > travelling and do not have the required HW to test it anyway. I'd
> > > > welcome both review and testing by the interested parties (dwc
> > > > maintainer and users affected by existing bugs).
> > > >
> > > > Thanks,
> > > >
> > > >         M.
> > > >
> > > > [1] https://patchwork.kernel.org/patch/10657987/
> > > >
> > > > Marc Zyngier (3):
> > > >   PCI: designware: Use interrupt masking instead of disabling
> > > >   PCI: designware: Take lock when ACKing an interrupt
> > > >   PCI: designware: Move interrupt acking into the proper callback
> > > >
> > > >  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
> > > >  1 file changed, 14 insertions(+), 8 deletions(-)
> > > >
> > >
> > > for pcie-qcom:
> > >
> > > Tested-by: Stanimir Varbanov <svarbanov@xxxxxxxxxx>
> >
> > Hello PCI folks,
> >
> > Since this is a real bug, we should try get a couple of Tested-by tags
> > before it's too late.
> > It would be nice if v4.20 was released without broken MSIs in this driver.
> >
> > Personally I get confused just by looking at this mail thread.
> >
> > I see 3 patches from Marc and a fix-up from Marc, but I also see
> > a patch from Gustavo, and another patch from Trent.
> >
> > Is seems like Lorenzo has a branch with Marc's 3 patches + Marc's fix-up
> > folded in here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi
> >
> > Perhaps it would be a good idea to send a V2, with proper Fixes-tags,
> > just so that people would know what to test, so that we can start getting
> > those Tested-by tags.
>
> Perhaps it would be a good idea to pull the branch above and test it
> after I have sent three reminders to all DWC host bridge maintainers through
> this email thread.
>
> I have no problem reposting those patches but it is time you started
> testing them, I have already explained what's the issue they are fixing
> in this thread, I do not think a Fixes: tag will add any further degree
> of urgency.
>

I tested Lorenzo's
https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi
branch with drivers/pci/controller/dwc/pcie-qcom.c.

Test setup: an ath10k PCIe endpoint connected, with ath10k running as a
host access point (hostapd), while watching the interrupt counters:

    watch "cat /proc/interrupts | grep ath10k_pci"

Without this branch, I consistently stop getting interrupts in less than
a minute.

With this branch, I've been able to run the same test case successfully
for 30+ minutes.

Tested-by: Niklas Cassel <niklas.cassel@xxxxxxxxxx>
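
For anyone who would rather not stare at watch output while reproducing
this, here is a rough Python sketch of the same check. It is not part of
the test I ran; the function names, the 5 second poll interval and the
"counter did not move between two polls" criterion are all my own
assumptions. It only makes sense while the device is expected to keep
raising interrupts (e.g. hostapd is up and beaconing).

```python
import time


def irq_count(text, name):
    """Sum the per-CPU counters of every /proc/interrupts line containing name."""
    total = 0
    for line in text.splitlines():
        if name not in line or ":" not in line:
            continue  # skips the CPU header row and unrelated IRQ lines
        # Assumed layout: "<irq>: <cpu0> <cpu1> ... <chip> <hwirq> <type> <name>"
        for field in line.split(":", 1)[1].split():
            if not field.isdigit():
                break  # the per-CPU counters end at the first non-numeric field
            total += int(field)
    return total


def watch_for_stall(name, interval=5.0, path="/proc/interrupts"):
    """Poll until the counter for name stops increasing; return seconds elapsed.

    Hypothetical helper: a stall is declared as soon as two consecutive
    polls return the same total, so pick an interval longer than the
    longest idle gap you expect from the device.
    """
    start = time.monotonic()
    with open(path) as f:
        last = irq_count(f.read(), name)
    while True:
        time.sleep(interval)
        with open(path) as f:
            current = irq_count(f.read(), name)
        if current == last:
            return time.monotonic() - start
        last = current
```

Calling watch_for_stall("ath10k_pci") blocks for as long as the counter
keeps moving and returns once it stops, which gives a time-to-stall figure
that is easier to compare between kernels than eyeballing the watch output.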