On 01/12/2017 03:30 PM, Tony Lindgren wrote: > Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support") > together with recent MUSB changes allowed USB and DMA on BeagleBone to idle > when no cable is connected. But looks like few corner case issues still > remain. > > Looks like just by repluging USB cable about ten or so times on BeagleBone > when configured in USB peripheral mode we can get warnings and eventually > trigger an oops in cppi41 DMA: > > WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+ > x28/0x38 [cppi41] > ... > > WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452 > push_desc_queue+0x94/0x9c [cppi41] > ... > > Unable to handle kernel NULL pointer dereference at virtual > address 00000104 > pgd = c0004000 > [00000104] *pgd=00000000 > Internal error: Oops: 805 [#1] SMP ARM > ... > [<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>] > (__rpm_callback+0xc0/0x214) > [<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80) > [<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c) > [<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8) > [<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808) > > This is because of a race with PM runtime and cppi41_dma_issue_pending() > as reported by Alexandre Bailon <abailon@xxxxxxxxxxxx> in earlier > set of patches. Based on mailing list discussions we however came to the > conclusion that a different fix from Alexandre's fix is needed in order > to guarantee that DMA is really active when we try to use it. > > To fix the issue, we need to add a driver specific flag as we otherwise > can have -EINPROGRESS state set by PM runtime and can't rely on > pm_runtime_active() to tell us when we can use the DMA. > > For reference, this is also documented in Documentation/power/runtime_pm.txt > in the example at the end of the file as pointed out by Grygorii Strashko > <grygorii.strashko@xxxxxx>. > > So let's fix the issue by introducing atomic_t active that gets toggled > in runtime_suspend() and runtime_resume(). And then let's replace the > pm_runtime_active() checks with atomic_read(). > > Note that we may also get transactions queued while in -EINPROGRESS > state. So we need to check for new elements in cppi41_runtime_resume() > by replacing list_for_each_entry_safe() with the commonly used test for > while (!list_empty(&cdd->pending)). > > Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support") > Cc: Andy Shevchenko <andy.shevchenko@xxxxxxxxx> > Cc: Bin Liu <b-liu@xxxxxx> > Cc: Grygorii Strashko <grygorii.strashko@xxxxxx> > Cc: Kevin Hilman <khilman@xxxxxxxxxxxx> > Cc: Patrick Titiano <ptitiano@xxxxxxxxxxxx> > Cc: Sergei Shtylyov <sergei.shtylyov@xxxxxxxxxxxxxxxxxx> > Reported-by: Alexandre Bailon <abailon@xxxxxxxxxxxx> > Signed-off-by: Tony Lindgren <tony@xxxxxxxxxxx> > --- > drivers/dma/cppi41.c | 23 ++++++++++++++++++----- > 1 file changed, 18 insertions(+), 5 deletions(-) > > diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c > --- a/drivers/dma/cppi41.c > +++ b/drivers/dma/cppi41.c > @@ -1,3 +1,4 @@ > +#include <linux/atomic.h> > #include <linux/delay.h> > #include <linux/dmaengine.h> > #include <linux/dma-mapping.h> > @@ -153,6 +154,8 @@ struct cppi41_dd { > > /* context for suspend/resume */ > unsigned int dma_tdfdq; > + > + atomic_t active; > }; > > #define FIST_COMPLETION_QUEUE 93 > @@ -320,7 +323,8 @@ static irqreturn_t cppi41_irq(int irq, void *data) > int error; > > error = pm_runtime_get(cdd->ddev.dev); > - if (error < 0) > + if (((error != -EINPROGRESS) && error < 0) || > + !atomic_read(&cdd->active)) > dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n", > __func__, error); > > @@ -482,7 +486,7 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan) > return; > } > Sry, but even if it looks better it might still race with PM runtime :( > - if (likely(pm_runtime_active(cdd->ddev.dev))) > + if (likely(atomic_read(&cdd->active))) > push_desc_queue(c); > else - CPU is here (-EINPROGRESS and active == 0) and then preempted - PM runtime will finish cppi41_runtime_resume and clean-up pending descs - CPU return here and adds desc to the pending queue - oops Am I wrong? > pending_desc(c); > @@ -1003,6 +1007,7 @@ static int cppi41_dma_probe(struct platform_device *pdev) > cdd->ddev.dst_addr_widths = CPPI41_DMA_BUSWIDTHS; > cdd->ddev.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST; > cdd->ddev.dev = dev; > + atomic_set(&cdd->active, 0); > INIT_LIST_HEAD(&cdd->ddev.channels); > cpp41_dma_info.dma_cap = cdd->ddev.cap_mask; > > @@ -1151,6 +1156,7 @@ static int __maybe_unused cppi41_runtime_suspend(struct device *dev) > { > struct cppi41_dd *cdd = dev_get_drvdata(dev); > > + atomic_set(&cdd->active, 0); > WARN_ON(!list_empty(&cdd->pending)); > > return 0; > @@ -1159,13 +1165,20 @@ static int __maybe_unused cppi41_runtime_suspend(struct device *dev) > static int __maybe_unused cppi41_runtime_resume(struct device *dev) > { > struct cppi41_dd *cdd = dev_get_drvdata(dev); > - struct cppi41_channel *c, *_c; > + struct cppi41_channel *c; > unsigned long flags; > > + atomic_set(&cdd->active, 1); > + > spin_lock_irqsave(&cdd->lock, flags); > - list_for_each_entry_safe(c, _c, &cdd->pending, node) { > - push_desc_queue(c); > + while (!list_empty(&cdd->pending)) { > + c = list_first_entry(&cdd->pending, > + struct cppi41_channel, > + node); > list_del(&c->node); > + spin_unlock_irqrestore(&cdd->lock, flags); > + push_desc_queue(c); > + spin_lock_irqsave(&cdd->lock, flags); > } > spin_unlock_irqrestore(&cdd->lock, flags); > > -- regards, -grygorii -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html