Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Grygorii Strashko <grygorii.strashko@xxxxxx> [170113 10:37]:
> 
> 
> On 01/13/2017 12:01 PM, Tony Lindgren wrote:
> > Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
> > together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
> > when no cable is connected. But looks like few corner case issues still
> > remain.
> > 
> > Looks like just by re-plugging USB cable about ten or so times on BeagleBone
> > when configured in USB peripheral mode we can get warnings and eventually
> > trigger an oops in cppi41 DMA:
> > 
> > WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+
> > x28/0x38 [cppi41]
> > ...
> > 
> > WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
> > push_desc_queue+0x94/0x9c [cppi41]
> > ...
> > 
> > Unable to handle kernel NULL pointer dereference at virtual
> > address 00000104
> > pgd = c0004000
> > [00000104] *pgd=00000000
> > Internal error: Oops: 805 [#1] SMP ARM
> > ...
> > [<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
> > (__rpm_callback+0xc0/0x214)
> > [<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
> > [<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
> > [<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
> > [<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)
> > 
> > This is because of a race with runtime PM and cppi41_dma_issue_pending()
> > as reported by Alexandre Bailon <abailon@xxxxxxxxxxxx> in earlier
> > set of patches. Based on mailing list discussions we however came to the
> > conclusion that a different fix from Alexandre's fix is needed in order
> > to guarantee that DMA is really active when we try to use it.
> > 
> > To fix the issue, we need to add a driver specific flag as we otherwise
> > can have -EINPROGRESS state set by runtime PM and can't rely on
> > pm_runtime_active() to tell us when we can use the DMA.
> > 
> > And we need to make sure the DMA transfers get triggered in the queued
> > order. So let's always queue the transfers, then flush the queue
> > from both cppi41_dma_issue_pending() and cppi41_runtime_resume()
> > as suggested by Grygorii Strashko <grygorii.strashko@xxxxxx> in an
> > earlier example patch.
> > 
> > For reference, this is also documented in Documentation/power/runtime_pm.txt
> > in the example at the end of the file as pointed out by Grygorii Strashko
> > <grygorii.strashko@xxxxxx>.
> > 
> > Based on earlier patches from Alexandre Bailon <abailon@xxxxxxxxxxxx>
> > and Grygorii Strashko <grygorii.strashko@xxxxxx> modified based on
> > testing and what was discussed on the mailing lists.
> > 
> > Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
> > Cc: Andy Shevchenko <andy.shevchenko@xxxxxxxxx>
> > Cc: Bin Liu <b-liu@xxxxxx>
> > Cc: Grygorii Strashko <grygorii.strashko@xxxxxx>
> > Cc: Kevin Hilman <khilman@xxxxxxxxxxxx>
> > Cc: Patrick Titiano <ptitiano@xxxxxxxxxxxx>
> > Cc: Sergei Shtylyov <sergei.shtylyov@xxxxxxxxxxxxxxxxxx>
> > Reported-by: Alexandre Bailon <abailon@xxxxxxxxxxxx>
> > Signed-off-by: Tony Lindgren <tony@xxxxxxxxxxx>
> > ---
> > 
> > Alexandre & Grygorii, can you guys please review and test?
> 
> I've just tested it on BBB. And it triggers warning from IRQ
> 
> root@am335x-evm:~# [  242.280546] ------------[ cut here ]------------
> [  242.285532] WARNING: CPU: 0 PID: 0 at drivers/dma/cppi41.c:330 cppi41_irq+0x228/0x260 [cppi41]
> [  242.335164] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.10.0-rc3-00560-g51afc29 #133
> [  242.344601] Hardware name: Generic AM33XX (Flattened Device Tree)
> [  242.351084] [<c011013c>] (unwind_backtrace) from [<c010c300>] (show_stack+0x10/0x14)
> [  242.359269] [<c010c300>] (show_stack) from [<c04a0278>] (dump_stack+0xac/0xe0)
> [  242.366919] [<c04a0278>] (dump_stack) from [<c013700c>] (__warn+0xd8/0x104)
> [  242.374282] [<c013700c>] (__warn) from [<c01370e4>] (warn_slowpath_null+0x20/0x28)
> [  242.382299] [<c01370e4>] (warn_slowpath_null) from [<bf1d0384>] (cppi41_irq+0x228/0x260 [cppi41])
> [  242.391714] [<bf1d0384>] (cppi41_irq [cppi41]) from [<c01a6888>] (__handle_irq_event_percpu+0x48/0x3b4)
> [  242.401727] [<c01a6888>] (__handle_irq_event_percpu) from [<c01a6c10>] (handle_irq_event_percpu+0x1c/0x58)
> [  242.412017] [<c01a6c10>] (handle_irq_event_percpu) from [<c01a6c84>] (handle_irq_event+0x38/0x5c)
> [  242.421477] [<c01a6c84>] (handle_irq_event) from [<c01aa004>] (handle_level_irq+0xc0/0x154)
> [  242.430378] [<c01aa004>] (handle_level_irq) from [<c01a5b9c>] (generic_handle_irq+0x20/0x34)
> [  242.439378] [<c01a5b9c>] (generic_handle_irq) from [<c01a60f0>] (__handle_domain_irq+0x64/0xe0)
> [  242.448660] [<c01a60f0>] (__handle_domain_irq) from [<c07f92f0>] (__irq_svc+0x70/0x98)
> [  242.457117] [<c07f92f0>] (__irq_svc) from [<c010836c>] (arch_cpu_idle+0x20/0x3c)
> [  242.464935] [<c010836c>] (arch_cpu_idle) from [<c018bf7c>] (do_idle+0x164/0x218)
> [  242.472748] [<c018bf7c>] (do_idle) from [<c018c39c>] (cpu_startup_entry+0x18/0x1c)
> [  242.480758] [<c018c39c>] (cpu_startup_entry) from [<c0b00c30>] (start_kernel+0x344/0x3bc)
> [  242.489376] ---[ end trace 12c5b6488c1e8c75 ]---
> [  242.496525] usb 1-1.3: USB disconnect, device number 4
> 
> test sequence:
> - plug usb hub + cd card reader
> - plug usb stick in hub
>   37  mount /dev/sdb /media
>    38  ls /media/
>    39  rm /media/data 
>    40  time dd if=/dev/zero of=/media/data bs=128k count=100
>    41  umount /media
> - unplug USB stick - > Warning
> 
> ^ The same sequence without Hub -- > no warnings
> 
> When I plug-unplug USB stick from Hub I can reproduce it pretty often :(
> but it can be different issue :(

Hmm interesting, I'll try it here too. I wonder if adding spin_lock_irqsave/
restore around the WARN_ON() in the interrupt handler is enough to make it
go away?

> Patch by itself looks good

OK

Tony
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux PCI]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux