Re: [BUG] imx-sdma: readl_relaxed_poll_timeout_atomic() conversion

On Sat, Jun 22, 2019 at 08:10:29PM +0200, Michael Olbrich wrote:
> On Sat, Jun 22, 2019 at 05:53:18PM +0100, Russell King - ARM Linux admin wrote:
> > Old code:
> > 
> > -       while (!(ret = readl_relaxed(sdma->regs + SDMA_H_INTR) & 1)) {
> > -               if (timeout-- <= 0)
> > -                       break;
> > -               udelay(1);
> > -       }
> > 
> > So, while bit 0 is _clear_ the loop continues to poll.
> > 
> > 
> > New code:
> > 
> > +       ret = readl_relaxed_poll_timeout_atomic(sdma->regs + SDMA_H_STATSTOP,
> > +                                               reg, !(reg & 1), 1, 500);
> > 
> > Doesn't really tell us what the termination condition is (because of
> > the obfuscation taking away the details), but if we dig into the
> > macro maze:
> > 
> > #define readl_relaxed_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \
> >         readx_poll_timeout_atomic(readl_relaxed, addr, val, cond, delay_us, timeout_us)
> > 
> > #define readx_poll_timeout_atomic(op, addr, val, cond, delay_us, timeout_us) \
> > ({ \
> >         u64 __timeout_us = (timeout_us); \
> >         unsigned long __delay_us = (delay_us); \
> >         ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
> >         for (;;) { \
> >                 (val) = op(addr); \
> >                 if (cond) \
> >                         break; \
> > 
> > "cond" is passed in to here unmodified, so this becomes:
> > 
> > 	for (;;) {
> > 		reg = readl_relaxed(sdma->regs + SDMA_H_STATSTOP);
> > 		if (!(reg & 1))
> > 			break;
> > 
> > So, if bit 0 of this register is clear, we terminate the loop.
> > 
> > Seems to me like this is a great illustration why using a helper
> > _introduces_ bugs, because it hides the detail about what the exit
> > condition for the embedded loop actually is, and leads to this kind
> > of error.
> > 
> > In any case, the conversion is obviously incorrect.
> > 
> > I occasionally see the "Timeout waiting for CH0 ready" error during
> > boot on a cbi4, which, given the above, means that we did end up
> > seeing bit 1 set (so according to the old code, we waited
> > successfully.)
> 
> The old code was polling SDMA_H_INTR so it waited for the bit to be set.
> The new code (as documented in the commit message) polls SDMA_H_STATSTOP
> instead.
> I believe this register is called SDMAARM_STOP_STAT in the reference
> manual. And the documentation states: "Reading this register yields the
> current state of the HE[i] bits".
> And from the documentation of the SDMA "DONE" instruction:
> "Clear HE bit for the current channel, send an interrupt to the Arm
> platform for the current channel and reschedule."
> 
> My interpretation of this is, that waiting for the bit in SDMA_H_STATSTOP
> to become zero has the same effect as waiting for the bit in SDMA_H_INTR to
> be set. Or am I missing something?

So, why do all my iMX6 platforms now randomly spit out:

"imx-sdma 20ec000.sdma: Timeout waiting for CH0 ready"

at boot, whereas they didn't use to with older kernels?  Maybe channel
0 does not clear the HE[0] bit?

The documentation explicitly states that for initialisation, the
following is required:

• Set bit 0 of the SDMA_HSTART register to set HE[0] and allow Channel 0
  to run (assumes EO[0] and DO[0] were both set in previous step). This
  will cause SDMA to load the program RAM and channel contexts configured
  previously.
• Wait for Channel 0 to finish running. This is indicated by HI[0]=1 in
  the SDMA_SDMA_INTR register, or by optional interrupt to the ARM platform.
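
To make the documented wait concrete, here is a minimal user-space sketch
of polling for HI[0]=1 with a bounded number of reads.  It is not the
driver code: struct fake_sdma, fake_read_intr() and wait_ch0_done() are
hypothetical names invented for illustration, and the "register" is just
a variable that sets bit 0 after a chosen number of reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the SDMA interrupt register (HI[i] bits). */
struct fake_sdma {
	uint32_t intr;        /* simulated HI[i] bits */
	int reads_until_done; /* reads left before HI[0] appears; 0 = never */
};

/* Simulated register read: HI[0] becomes set on the Nth read. */
static uint32_t fake_read_intr(struct fake_sdma *s)
{
	assert(s != NULL);
	if (s->reads_until_done > 0 && --s->reads_until_done == 0)
		s->intr |= 1u;
	return s->intr;
}

/*
 * Wait for channel 0 to finish as the manual describes: poll until
 * HI[0] is set.  Returns true on success, false if max_polls reads
 * went by without seeing the bit.
 */
static bool wait_ch0_done(struct fake_sdma *s, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (fake_read_intr(s) & 1u)
			return true;
		/* a real driver would delay here, e.g. udelay(1) */
	}
	return false;
}
```

With reads_until_done = 3 and max_polls = 5 the wait succeeds; with
reads_until_done = 0 the bit never appears and the wait times out.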

So, is there a way for a HI bit to be set without clearing the HE bit?
Yes, via the NOTIFY command:

55.5.2.35 NOTIFY (Notify to ARM platform)
Operation:
if (jjj & 4 == 0)
{
  if (jjj&2 == 2)
    HE[CCR] ← 0
  if (jjj&1== 1)
    HI[CCR] ← 1
}
else if (jjj == 4)
  EP[CCR] ← 0
else

So, if jjj is 001 binary, the HI bit is set while the HE bit remains
set.  Maybe the firmware uses this rather than a DONE instruction
when performing the initialisation functions, which means your idea of
going against what is specified in the manual, and using HE[0] instead
of HI[0] is on _very_ shaky ground.
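
As a rough illustration, the NOTIFY operation quoted above can be modelled
in a few lines of C.  This is only a sketch of the pseudocode, not of the
hardware or the firmware: struct chan_flags, notify() and done() are
hypothetical names, the encodings the excerpt leaves out are treated as
no-ops, and done() assumes that "send an interrupt" means setting HI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the per-channel flag registers. */
struct chan_flags {
	uint32_t he; /* HE[i] bits */
	uint32_t hi; /* HI[i] bits */
	uint32_t ep; /* EP[i] bits */
};

/* NOTIFY, following the quoted pseudocode; ccr is the current channel. */
static void notify(struct chan_flags *f, unsigned int ccr, unsigned int jjj)
{
	assert(ccr < 32 && jjj < 8);
	uint32_t bit = UINT32_C(1) << ccr;

	if ((jjj & 4) == 0) {
		if (jjj & 2)
			f->he &= ~bit; /* HE[CCR] <- 0 */
		if (jjj & 1)
			f->hi |= bit;  /* HI[CCR] <- 1 */
	} else if (jjj == 4) {
		f->ep &= ~bit;         /* EP[CCR] <- 0 */
	}
	/* other encodings are not shown in the excerpt: left as no-ops */
}

/* DONE as described earlier in the thread: clear HE, raise the interrupt. */
static void done(struct chan_flags *f, unsigned int ccr)
{
	assert(ccr < 32);
	uint32_t bit = UINT32_C(1) << ccr;

	f->he &= ~bit;
	f->hi |= bit; /* assumption: the interrupt is reflected in HI */
}
```

Starting from HE[0]=1 and HI[0]=0, notify(&f, 0, 1) leaves HE[0] set and
sets HI[0], so a loop waiting for HE[0] to clear would time out even
though a loop waiting for HI[0] would succeed.  done(&f, 0) satisfies
both.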

Given that I'm seeing the same issue on _four_ iMX6 platforms here,
I think it's pretty much obvious that your assumptions here are
false.
 
> Michael
> 
> > Looking at the date of the commit, this is almost a three year old
> > bug.
> > 
> > -- 
> > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> > FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
> > According to speedtest.net: 11.9Mbps down 500kbps up
> > 
> 
> -- 
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


