RE: [PATCH v4] Move DWC2 driver out of staging

> From: Paul Zimmerman
> Sent: Monday, February 03, 2014 9:36 AM
> 
>> From: Stephen Warren [mailto:swarren@xxxxxxxxxxxxx]
>> Sent: Saturday, February 01, 2014 7:44 PM
>> 
>> On 02/01/2014 03:00 AM, Andre Heider wrote:
>>> On Fri, Jan 31, 2014 at 11:48:37PM -0700, Stephen Warren wrote:
>>>> On 01/31/2014 11:12 AM, Andre Heider wrote:
>>>>> On Mon, Jan 13, 2014 at 01:50:09PM -0800, Paul Zimmerman wrote:
>>>>>> The DWC2 driver should now be in good enough shape to move out of
>>>>>> staging. I have stress tested it overnight on RPI running mass
>>>>>> storage and Ethernet transfers in parallel, and for several days
>>>>>> on our proprietary PCI-based platform.
>>>> ...
>>>>> this looks just fine, but for whatever reason it breaks sdhci on my rpi.
>>>>> With today's Linus' master the dwc2 controller seems to initialize fine,
>>>>> but I get this upon boot:
>>>>>
>>>>> [    1.783316] sdhci-bcm2835 20300000.sdhci: sdhci_pltfm_init failed -12
>>>>> [    1.794820] sdhci-bcm2835: probe of 20300000.sdhci failed with error -12
>> ...
>>>> This is due to the following code:
>> ...
>>>> What ends up happening, simply due to memory allocation order, is that
>>>> the memory writes inside usb_settoggle() end up setting the SDHCI struct
>>>> platform_device's num_resources to 0, so that its call to
>>>> platform_get_resource() fails.
>>>> 
>>>> With the DWC2 move patch reverted, some other random piece of memory is
>>>> being corrupted, which just happens not to cause any visible problem.
>>>> Likely it's some other struct platform_device that's already had its
>>>> resources read by the time DWC2 probes and corrupts them.
>>>> 
>>>> (Yes, this was hard to find!)
>>> 
>>> Nice work, but how did you pinpoint this? Am I missing some option/tool
>>> or did I just not stare for long enough?
>> 
>> Well, there was a clear place where an issue was present; the resource
>> lookup in sdhci_pltfm_init() was failing, so I put a bunch of printfs
>> into that function to dump out the data platform_get_resource() used.
>> This clearly pointed at num_resources==0 being the problem. Next, I
>> dumped the same data from the code in drivers/of that sets it up, and it
>> was OK there, so I knew it was getting over-written somewhere. I then
>> basically added hundreds of calls to the same data dumping function
>> throughout kernel functions like really_probe() to track down the
>> location of the problem. Luckily, the behaviour was stable, so I wasn't
>> chasing a race/timing condition. Eventually I narrowed the window to the
>> few lines of code I mentioned in _dwc2_hcd_endpoint_reset(). It would
>> have been much harder if it was e.g. the USB HW DMAing to memory that
>> caused the corruption, so I was lucky:-)
> 
> Nice work Stephen, thanks. I will try to come up with a patch to fix this
> ASAP, along the lines of what Alan suggested.

Stephen, Andre,

Can you test the attached patch, please? It works for me on the Synopsys
PCIe-based FPGA board. Unfortunately my RPI board is currently broken,
so I am unable to test it there to verify it actually fixes the problem
you are seeing.

The dwc2 driver doesn't use the usb_device toggle bits anywhere else,
so the quickest fix is to just remove the problematic code from
_dwc2_hcd_endpoint_reset().

If you give me your tested-bys, I will submit this as a proper patch
to Greg.

-- 
Paul

Attachment: dwc2-toggle.patch
Description: dwc2-toggle.patch

