Re: [PATCH] PCI: dwc: Fix interrupt race in when handling MSI

On Fri, 2018-11-09 at 11:34 +0000, Marc Zyngier wrote:
> On 08/11/18 19:49, Trent Piepho wrote:
> > On Thu, 2018-11-08 at 09:49 +0000, Marc Zyngier wrote:
> > > 
> > Then that lasted four years until it was changed Aug 2017 in
> > https://patchwork.kernel.org/patch/9893303/
> > 
> > That lasted just six months until someone tried to revert it,
> > https://patchwork.kernel.org/patch/9893303/

Should be https://patchwork.kernel.org/patch/10208879/

> > 
> > Seems pretty clear the way it is now is much worse than the way it was
> > before, even if the previous design may have had another flaw.  Though
> > I've yet to see anyone point out something that makes the previous
> > design broken.  Sub-optimal yes, but not broken.
> 
> This is not what was reported by the previous submitter. I guess they
> changed this for a reason, no? I'm prepared to admit this is an end-point
> driver bug, but we need to understand what is happening and stop
> changing this driver randomly.

See Vignesh's recent message about the last change.  It was a mistaken
attempt to fix a problem, which it didn't fix, and I think we all agree
it's not right.

Reverting it is not "changing this driver randomly."  And I take that
as a personal offense.  You imply I just applied patches randomly until
something appeared to work.  Maybe you think this is all over my head? 
Far from it.  I traced every part of the interrupt path and thought
through every race.  Hamstrung by lack of docs, I still determined the
behavior that was relevant through empirical analysis.  I only
discovered this was a recent regression and Vignesh's earlier attempt
to revert it after I was done and was trying to determine how the code
got this way in the first place.

> > It feels like you're using this bug to hold designware hostage in a
> > broken kernel, and me along with them.  I don't have the documentation,
> > no one does, there's no way for me to give you what you want.  But I've
> > got hardware that doesn't work in the mainline kernel.
> 
> I take it as a personal offence that I'd be holding anything or anyone
> hostage. I think I have a long enough track record working with the
> Linux kernel not to take any of this nonsense. What's my interest in
> keeping anything in this sorry state? Think about it for a minute.

I'm sorry if you took it that way.  I appreciate that there are still
people who care about fixing things right and don't settle for whatever
the easiest thing is that lets them say they're done, even if that just
leaves time bombs for everyone who comes after.

So I thank you for taking a stand.

But I think it's clear that 8c934095fa2f was a mistake that causes
serious bugs.  That's not a random guess; it's well considered and well
tested.  Not reverting it now isn't helping anyone using stable kernels
with this regression.



