Re: v3.4-rc4 DSS PM problem (Was: Re: Problems with 3.4-rc5)

"Joe Woodward" <jw@xxxxxxxxxxxxxx> · Tue, 12 Jun 2012 11:15:10 +0100

Was there ever a conclusion to this discussion?

I'm assuming this is unlikely to be fixed in 3.5?

Cheers,
Joe

-----Original Message-----
From: Jean Pihet <jean.pihet@xxxxxxxxxxxxxx>
To: Paul Walmsley <paul@xxxxxxxxx>, Tomi Valkeinen <tomi.valkeinen@xxxxxx>
Cc: Joe Woodward <jw@xxxxxxxxxxxxxx>, khilman@xxxxxx, Archit Taneja <a0393947@xxxxxx>, linux-omap@xxxxxxxxxxxxxxx
Date: Fri, 25 May 2012 14:55:27 +0200
Subject: Re: v3.4-rc4 DSS PM problem (Was: Re: Problems with 3.4-rc5)

> Hi Tomi, Paul!
> 
> On Fri, May 25, 2012 at 10:24 AM, Tomi Valkeinen
> <tomi.valkeinen@xxxxxx> wrote:
> > On Thu, 2012-05-24 at 18:39 -0600, Paul Walmsley wrote:
> >> cc Jean
> >>
> >> Hello Tomi,
> >>
> >> On Wed, 16 May 2012, Tomi Valkeinen wrote:
> >>
> >> > I also suspect that this could be just a plain DSS bug. The default
> >> > FIFO low/high thresholds are 960/1023 bytes (i.e. DSS starts
> >> > refilling the FIFO when there are 960 or less bytes in the fifo, and
> >> > stops at 1023. The fifo is 1024 bytes). The values are calculated
> >> > with fifo_size - burst_size and fifo_size - 1.
> >> >
> >> > We are now using FIFO merge features, which combines multiple fifos
> >> > into one when possible, making the fifo size 1024*3 = 3072. Using the
> >> > same low threshold and increasing the high threshold to 960/3071
> >> > works fine. Changing the high threshold to 3008 causes underflows.
> >> > Increasing the low threshold to ~1600 makes DSS work again.
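
The threshold rule quoted above can be written out as a small C sketch.
This is only an illustration of the arithmetic, not the DSS driver code:
the struct and function names are hypothetical, and the 64-byte burst
size is an assumption inferred from 1024 - 960 and from the 16x32-bit
GFXBURSTSIZE setting mentioned later in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Thresholds in the same units as the FIFO size (bytes on omap2/3). */
struct fifo_thresholds {
    uint32_t low;   /* refill starts at or below this fill level */
    uint32_t high;  /* refill stops at this fill level */
};

/* Default rule from the thread: low = size - burst, high = size - 1. */
struct fifo_thresholds default_thresholds(uint32_t fifo_size,
                                          uint32_t burst_size)
{
    struct fifo_thresholds t;

    assert(burst_size > 0 && burst_size < fifo_size);
    t.low = fifo_size - burst_size;
    t.high = fifo_size - 1;
    return t;
}

/* Size of a FIFO merged from n_fifos equally sized FIFOs. */
uint32_t merged_fifo_size(uint32_t fifo_size, uint32_t n_fifos)
{
    return fifo_size * n_fifos;
}
```

With these assumptions, default_thresholds(1024, 64) gives 960/1023, and
applying the same rule to merged_fifo_size(1024, 3) = 3072 gives
3008/3071. The thread reports that 960/3071 works on the merged FIFO,
while using 3008 as the high threshold causes underflows.
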
> >>
> >> Just a few thoughts.
> >>
> >> In terms of the high threshold, it seems really strange to me that
> >> changing the high threshold would make such a difference.  Naïvely,
> >> I'd assume that you'd want to set it as high as possible?  I suppose
> >> in cases where the interconnect is congested, setting it lower might
> >> allow lower latency for other interconnect users, but I'd hope we
> >> don't have to worry much about that.  So it doesn't seem to me that
> >> there would be any advantage to setting it lower than the maximum.
> >
> > It's true that the high threshold should be set as high as possible,
> > and this is what we do. Except for DSI command mode output on OMAP3,
> > where, for unknown reason, the highest value (fifosize - 1) doesn't
> > work and we need to program it to fifosize - burstsize. And this was
> > causing the original problem, fifosize - burstsize was not working for
> > other outputs properly.
> >
> > I guess this also hints that there's something wrong with omap3 and
> > the dss fifo thresholds.
> >
> >> Probably the low threshold is the more important parameter, from a PM
> >> perspective.  If you know the FIFO's drain rate and the low threshold,
> >> it should be possible to calculate the maximum latency that the FIFO
> >> can tolerate to avoid an underflow.  This could be used to specify a
> >> device PM QoS constraint to prevent the interconnect latency from
> >> exceeding that value.
> >
> > Yes, this is how the low threshold should be adjusted. I have never
> > tried to calculate the threshold need, though, as I haven't had all
> > the information and understanding to properly calculate it.
> >
> >> I'd guess the calculations would be something like this -- (I hope
> >> you can correct my relative ignorance of the DSS in the following
> >> estimates):
> >>
> >> Looking at mach-omap2/board-rx51-video.c, let's suppose that the FIFO
> >> drain rate would be 864 x 480 x 32 bits/second.  Since the FIFO width
> >> is 32 bits, that's
> >
> > I think the DSS fifo entries are 8 bit on omap2/3, 128bits on omap4.
> > At least those are the "units" used with fifo size, threshold sizes,
> > burst size, etc.
> >
> >>    864 x 480 = 414 720 FIFO entries/second, or
> >>
> >>    (1 000 000 µs/s / 414 720 FIFO entries/s) = ~2.411 µs/FIFO entry.
> >>
> >> So if you need a low FIFO threshold at 960 entries, you could call
> >> the device PM QoS functions to set a wakeup latency constraint so that
> >> the interconnect latency would be no greater than this:
> >>
> >>    (2.411 µs/FIFO entry * 960 FIFO entries) = ~2 315 µs
> >>
> >> (The reality is that it would need to be something less than this, to
> >> account for the time needed for the GFX DMA transfer to start
> >> supplying data, etc.)
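
The estimate above amounts to "low threshold divided by drain rate". A
minimal C sketch of that calculation follows. The function names and the
overhead_us parameter are hypothetical, and the drain rate is simply
taken as an input; a real figure would likely also scale with the
refresh rate, which the estimate above does not include.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time, in microseconds, for the FIFO to drain from the low threshold
 * to empty when no refill arrives.
 */
double fifo_drain_time_us(uint32_t low_threshold_entries,
                          double drain_entries_per_sec)
{
    assert(drain_entries_per_sec > 0.0);
    return (double)low_threshold_entries * 1e6 / drain_entries_per_sec;
}

/*
 * Latency bound after subtracting a fixed overhead (hypothetical
 * parameter, e.g. the time for the DMA transfer to start supplying
 * data). Clamped at zero.
 */
double max_wakeup_latency_us(uint32_t low_threshold_entries,
                             double drain_entries_per_sec,
                             double overhead_us)
{
    double t = fifo_drain_time_us(low_threshold_entries,
                                  drain_entries_per_sec);

    assert(overhead_us >= 0.0);
    return t > overhead_us ? t - overhead_us : 0.0;
}
```

With the figures from the thread, fifo_drain_time_us(960, 864.0 * 480.0)
comes to roughly 2314.8 µs. If the FIFO units are 8 bits and the pixels
are 32 bits, the drain rate in entries is four times higher and the same
960 entries last only about 578.7 µs. A value derived this way, rounded
down, is the kind of number that could be passed as the constraint to
dev_pm_qos_add_request().
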
> >
> > Makes sense.
> >
> > Another reason for underflows we have is the different rotation
> > engines. VRFB on omap2/3, and TILER on omap4. Both increase the "work"
> > needed to get pixels, although I'm not sure what the actual causes for
> > the increased work are.
> >
> >> The ultimate goal, with Jean's device PM QoS patches, is that these
> >> constraints could change the DPLL autoidle settings or powerdomain
> >> states to ensure the constraint was met.  He's got a page here:
> Indeed! The core code is ready and the OMAP power domains code is
> under review for the moment. The ultimate goal is to split the overall
> latency of a device into the contributors (SW, HW SoC, HW external
> etc.), so the DPLL relock time would be taken into account. However
> without the submitted code in place there is no way to build the
> feature in incremental steps.
> 
> >>
> >>  http://omappedia.org/wiki/Power_Management_Device_Latencies_Measurement
> In the wiki page there is a link to the ELC/Fosdem presentation [1]
> about the new model for the latency.
> [1] http://omappedia.org/wiki/File:ELC-2012-jpihet-DeviceLatencyModel.pdf
> 
> >>
> >> (Unfortunately it's not clear what the DPLL autoidle modes and voltage
> >> scaling bits are set to for many of the estimates, and we also know
> >> that
> The code is from an l-o tree with the measurement code added, so the
> DPLLs are allowed to auto-idle. In the new model the DPLL relock latency
> contribution should be split from the power domains latency.
> 
> >> there are many software optimizations possible for our idle path.)
> Sure! Recently we have had the case with the C1 cpuidle state.
> Hopefully some simple experimental optimizations did fix the issue.
> 
> Regards,
> Jean
> 
> >>
> >> We're still working on getting the OMAP device PM QoS patches merged,
> >> but the Linux core support is there, so you should be able to patch
> >> your drivers to use them -- see for example dev_pm_qos_add_request().
> >
> > Thanks for the pointers, I need to study that.
> >
> >> Just paging through the DSS TRM section, some other settings that
> >> might be worth checking are:
> >>
> >> - is DISPC_GFX_ATTRIBUTES.GFXBURSTSIZE set to 16x32?
> >
> > Yes. (8 x 128 on omap4)
> >
> > I presume each DMA burst has a small overhead, so maximizing the burst
> > size minimizes the overhead. Do you see any other effect with the
> > burst size? I mean, do you see any need to know the burst size value
> > when trying to calculate optimal thresholds?
> >
> >> - is DISPC_GFX_ATTRIBUTES.GFXFIFOPRELOAD set to 1?
> >
> > No. We set it to 0 so that PRELOAD is used. If I've understood right,
> > the problem with using GFXFIFOPRELOAD=1, i.e. high threshold is used
> > for preload value, is that the high threshold can be quite high, and
> > the preload needs to happen during vertical blanking. With a small
> > vblank time and high high threshold there may not be enough time for
> > the preload.
> >
> > Then again, I have not verified that. And I'm not sure why it would be
> > a problem if the FIFO is not loaded up to the preload value during
> > blanking, presuming we still have enough pixels to proceed normally.
> >
> > For me it would make more sense to always load the fifo to full, so
> > there wouldn't be need for any PRELOAD value at all.
> >
> >> - is DISPC_GFX_PRELOAD.PRELOAD set to the maximum possible value?
> >
> > No, it's left at the default value. But I have tried adjusting this
> > (and also changing the GFXFIFOPRELOAD bit), and neither fixed the
> > original problem.
> >
> >> - is DISPC_CONFIG.FIFOFILLING set to 1?
> >
> > No, it's set to 0. With this problem there's only one overlay enabled
> > so it shouldn't have any effect.
> >
> >  Tomi
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
