Re: [PATCH] PCI: dwc: Fix interrupt race in when handling MSI

On 09/11/18 3:43 PM, Lorenzo Pieralisi wrote:
> On Thu, Nov 08, 2018 at 07:49:52PM +0000, Trent Piepho wrote:
>> On Thu, 2018-11-08 at 09:49 +0000, Marc Zyngier wrote:
>>> On 07/11/18 20:17, Trent Piepho wrote:
>>>> On Wed, 2018-11-07 at 18:41 +0000, Marc Zyngier wrote:
>>>>> On 06/11/18 19:40, Trent Piepho wrote:
>>>>>>
>>>>>> What about stable kernels that don't have the hierarchical API?
>>>>>
>>>>> My goal is to fix mainline first. Once we have something that works on
>>>>> mainline, we can look at propagating the fix to other versions. But
>>>>> mainline always comes first.
>>>>
>>>> This is a regression that went into 4.14.  Wouldn't the appropriate
>>>> action for the stable series be to undo the regression?
>>>
>>> This is not how stable works. Stable kernels *only* contain patches that
>>> are backported from mainline, and do not take standalone patches.
>>>
>>> Furthermore, your fix is to actually undo someone else's fix. Who is
>>> right? In the absence of any documentation, the answer is "nobody".
>>
>> A little more history to this bug.  The code was originally the way it is
>> now, but this same bug was fixed in 2013 in
>> https://patchwork.kernel.org/patch/3333681/
>>
>> Then that lasted four years until it was changed in Aug 2017 in
>> https://patchwork.kernel.org/patch/9893303/
>>
>> That lasted just six months until someone tried to revert it,
>> https://patchwork.kernel.org/patch/9893303/
> 
> The last link is the same as the previous one, unless I am missing
> something.
> 
>> Seems pretty clear the way it is now is much worse than the way it was
>> before, even if the previous design may have had another flaw.  Though
>> I've yet to see anyone point out something that makes the previous design
>> broken.  Sub-optimal yes, but not broken.
> 
> The way I see it is: either the MSI handling works or it does not.
> 
> AFAICS:
> 
> 8c934095fa2f ("PCI: dwc: Clear MSI interrupt status after it is handled,
> not before")
> 
> was fixing a bug causing "timeouts on some wireless lan cards". We want
> to understand what the problem is and fix it once and for all on all
> DWC-based systems.
> 

That issue was root-caused to a HW erratum in the dra7xx DWC wrapper,
which requires a special way of handling MSI interrupts at the wrapper
level. More info in this thread:
https://www.spinics.net/lists/linux-pci/msg70462.html

Unfortunately, commit 8c934095fa2f did not fix the WLAN issue in longer
tests and also broke PCIe USB cards. Therefore, it makes sense to revert
8c934095fa2f.
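
To picture the ordering question behind 8c934095fa2f, here is a minimal
user-space sketch. It is not the driver code. It assumes a latched,
write-1-to-clear status bit per MSI vector, and every name in it
(msi_model, msi_irq_pass, simulate, ...) is hypothetical. A second MSI on
the same vector is injected while the first one is being handled:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a latched, write-1-to-clear MSI status register. */
struct msi_model {
	uint32_t status;		/* latched pending bits */
	unsigned int handled;		/* handler invocations so far */
	bool inject_during_handler;	/* raise the vector again mid-handler */
};

enum ack_order { ACK_BEFORE_HANDLER, ACK_AFTER_HANDLER };

static void msi_raise(struct msi_model *m, unsigned int bit)
{
	m->status |= 1u << bit;
}

static void msi_ack(struct msi_model *m, unsigned int bit)
{
	m->status &= ~(1u << bit);
}

/* Stand-in for the endpoint driver's handler. */
static void device_handler(struct msi_model *m, unsigned int bit)
{
	m->handled++;
	if (m->inject_during_handler) {
		/* A second MSI lands while the first is being serviced. */
		m->inject_during_handler = false;
		msi_raise(m, bit);
	}
}

/* One pass over the status bits, acking before or after the handler. */
static void msi_irq_pass(struct msi_model *m, enum ack_order order)
{
	uint32_t pending = m->status;
	unsigned int bit;

	for (bit = 0; bit < 32; bit++) {
		if (!(pending & (1u << bit)))
			continue;
		if (order == ACK_BEFORE_HANDLER)
			msi_ack(m, bit);
		device_handler(m, bit);
		if (order == ACK_AFTER_HANDLER)
			msi_ack(m, bit);	/* also wipes the second MSI */
	}
}

/* Returns how many times the handler ran for two MSIs on vector 0. */
static unsigned int simulate(enum ack_order order)
{
	struct msi_model m = { 0, 0, true };

	msi_raise(&m, 0);
	while (m.status)	/* service again for as long as work is latched */
		msi_irq_pass(&m, order);
	return m.handled;
}
```

In this model, simulate(ACK_BEFORE_HANDLER) runs the handler twice, while
simulate(ACK_AFTER_HANDLER) runs it only once because the late ack clears
the bit the second MSI had just set. Real hardware may behave differently,
which is why the actual interrupt flow from Synopsys matters.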

I am working on patches to fix the dra7xx wrapper for the WLAN card issue.

Regards
Vignesh

>>> Anything can be backported to stable once we understand the issue. At
>>> the moment, we're just playing games moving stuff around and hoping
>>> nothing else will break. That's not a sustainable way of maintaining
>>> this driver. At the moment, the only patch I'm inclined to propose until
>>> we get an actual interrupt handling flow from Synopsys is to mark this
>>> driver as "BROKEN".
>>
>> It feels like you're using this bug to hold designware hostage in a
>> broken kernel, and me along with them.  I don't have the documentation,
>> no one does, there's no way for me to give you what you want.  But I've
>> got hardware that doesn't work in the mainline kernel.
> 
> Nobody is holding anyone hostage here, it is a pretty normal patch
> discussion, given the controversial history of fixes you reported
> we are just trying to get the whole picture.
> 
> There is a bug that ought to be fixed. You are doing the right thing
> with the feedback you are providing, and DWC maintainers must provide the
> information you need to get to the bottom of this, once and for all. It
> is as simple as that.
> 
> Thanks,
> Lorenzo
> 


