Re: [PATCH] PCI: dwc: Fix interrupt race in when handling MSI

On Thu, Nov 08, 2018 at 07:49:52PM +0000, Trent Piepho wrote:
> On Thu, 2018-11-08 at 09:49 +0000, Marc Zyngier wrote:
> > On 07/11/18 20:17, Trent Piepho wrote:
> > > On Wed, 2018-11-07 at 18:41 +0000, Marc Zyngier wrote:
> > > > On 06/11/18 19:40, Trent Piepho wrote:
> > > > > 
> > > > > What about stable kernels that don't have the hierarchical API?
> > > > 
> > > > My goal is to fix mainline first. Once we have something that works on
> > > > mainline, we can look at propagating the fix to other versions. But
> > > > mainline always comes first.
> > > 
> > > This is a regression that went into 4.14.  Wouldn't the appropriate
> > > action for the stable series be to undo the regression?
> > 
> > This is not how stable works. Stable kernels *only* contain patches that
> > are backported from mainline, and do not take standalone patches.
> > 
> > Furthermore, your fix is to actually undo someone else's fix. Who is
> > right? In the absence of any documentation, the answer is "nobody".
> 
> Little more history to this bug.  The code was originally the way it is
> now, but this same bug was fixed in 2013 in
> https://patchwork.kernel.org/patch/3333681/
> 
> Then that lasted four years until it was changed Aug 2017 in
> https://patchwork.kernel.org/patch/9893303/
> 
> That lasted just six months until someone tried to revert it,
> https://patchwork.kernel.org/patch/9893303/

The last link is the same as the previous one, unless I am missing
something.

> Seems pretty clear the way it is now is much worse than the way it was
> before, even if the previous design may have had another flaw.  Though
> I've yet to see anyone point out something that makes the previous
> design broken.  Sub-optimal yes, but not broken.

The way I see it is: either the MSI handling works or it does not.

AFAICS:

8c934095fa2f ("PCI: dwc: Clear MSI interrupt status after it is handled,
not before")

was fixing a bug that caused "timeouts on some wireless lan cards". We
want to understand what the problem is and fix it once and for all on
all DWC-based systems.

> > Anything can be backported to stable once we understand the issue. At
> > the moment, we're just playing games moving stuff around and hope
> > nothing else will break. That's not a sustainable way of maintaining
> > this driver. At the moment, the only patch I'm inclined to propose until
> > we get an actual interrupt handling flow from Synopsys is to mark this
> > driver as "BROKEN".
> 
> It feels like you're using this bug to hold designware hostage in a
> broken kernel, and me along with them.  I don't have the documentation,
> no one does, there's no way for me to give you what you want.  But I've
> got hardware that doesn't work in the mainline kernel.

Nobody is holding anyone hostage here; it is a pretty normal patch
discussion. Given the controversial history of fixes you reported,
we are just trying to get the whole picture.

There is a bug that ought to be fixed. You are doing the right thing
with the feedback you are providing, and the DWC maintainers must provide
the information you need to get to the bottom of this once and for all.
It is as simple as that.

Thanks,
Lorenzo


