On 09/11/18 3:43 PM, Lorenzo Pieralisi wrote:
> On Thu, Nov 08, 2018 at 07:49:52PM +0000, Trent Piepho wrote:
>> On Thu, 2018-11-08 at 09:49 +0000, Marc Zyngier wrote:
>>> On 07/11/18 20:17, Trent Piepho wrote:
>>>> On Wed, 2018-11-07 at 18:41 +0000, Marc Zyngier wrote:
>>>>> On 06/11/18 19:40, Trent Piepho wrote:
>>>>>>
>>>>>> What about stable kernels that don't have the hierarchical API?
>>>>>
>>>>> My goal is to fix mainline first. Once we have something that works on
>>>>> mainline, we can look at propagating the fix to other versions. But
>>>>> mainline always comes first.
>>>>
>>>> This is a regression that went into 4.14. Wouldn't the appropriate
>>>> action for the stable series be to undo the regression?
>>>
>>> This is not how stable works. Stable kernels *only* contain patches that
>>> are backported from mainline, and do not take standalone patch.
>>>
>>> Furthermore, your fix is to actually undo someone else's fix. Who is
>>> right? In the absence of any documentation, the answer is "nobody".
>>
>> Little more history to this bug. The code was originally the way it is
>> now, but this same bug was fixed in 2013 in
>> https://patchwork.kernel.org/patch/3333681/
>>
>> Then that lasted four years until it was changed Aug 2017 in
>> https://patchwork.kernel.org/patch/9893303/
>>
>> That lasted just six months until someone tried to revert it,
>> https://patchwork.kernel.org/patch/9893303/
>
> The last link is the same as the previous one, unless I am missing
> something.
>
>> Seems pretty clear the way it is now is much worse than the way it was
>> before, even if the previous design may have had another flaw. Though
>> I've yet to see anyone point out something makes the previous design
>> broken. Sub-optimal yes, but not broken.
>
> The way I see it is: either the MSI handling works or it does not.
>
> AFAICS:
>
> 8c934095fa2f ("PCI: dwc: Clear MSI interrupt status after it is handled,
> not before")
>
> was fixing a bug, causing "timeouts on some wireless lan cards", we want
> to understand what the problem is, fix it once for all on all DWC
> based systems.
>

That issue was root-caused to a HW erratum in the dra7xx DWC wrapper,
which requires a special way of handling MSI interrupts at the wrapper
level. More info in this thread:
https://www.spinics.net/lists/linux-pci/msg70462.html

Unfortunately, commit 8c934095fa2f did not fix the WLAN issue in longer
tests and also broke PCIe USB cards. Therefore, it makes sense to revert
8c934095fa2f.

I am working on patches to fix the dra7xx wrapper for the WLAN card issue.

Regards
Vignesh

>>> Anything can be backported to stable once we understand the issue. At
>>> the moment, we're just playing games moving stuff around and hope
>>> nothing else will break. That's not a sustainable way of maintaining
>>> this driver. At the moment, the only patch I'm inclined to propose until
>>> we get an actual interrupt handling flow from Synopsys is to mark this
>>> driver as "BROKEN".
>>
>> It feels like you're using this bug to hold designware hostage in a
>> broken kernel, and me along with them. I don't have the documentation,
>> no one does, there's no way for me to give you want you want. But I've
>> got hardware that doesn't work in the mainline kernel.
>
> Nobody is holding anyone hostage here, it is a pretty normal patch
> discussion, given the controversial history of fixes you reported
> we are just trying to get the whole picture.
>
> There is a bug that ought to be fixed, you are doing the right thing
> with the feedback you are providing and DWC maintainers must provide the
> information you need to get to the bottom of this, once for all, that's
> as simple as that.
>
> Thanks,
> Lorenzo
>