Re: v3.4-rc4 DSS PM problem (Was: Re: Problems with 3.4-rc5)

On Thu, 2012-05-24 at 18:39 -0600, Paul Walmsley wrote:
> cc Jean
> 
> Hello Tomi,
> 
> On Wed, 16 May 2012, Tomi Valkeinen wrote:
> 
> > I also suspect that this could be just a plain DSS bug. The default FIFO
> > low/high thresholds are 960/1023 bytes (i.e. DSS starts refilling the
> > FIFO when there are 960 or fewer bytes in the fifo, and stops at 1023.
> > The fifo is 1024 bytes). The values are calculated with fifo_size -
> > burst_size and fifo_size - 1.
> > 
> > We are now using FIFO merge features, which combines multiple fifos into
> > one when possible, making the fifo size 1024*3 = 3072. Using the same
> > low threshold and increasing the high threshold to 960/3071 works fine.
> > Changing the high threshold to 3008 causes underflows. Increasing the
> > low threshold to ~1600 makes DSS work again.
> 
> Just a few thoughts.
> 
> In terms of the high threshold, it seems really strange to me that 
> changing the high threshold would make such a difference.  Naïvely, I'd 
> assume that you'd want to set it as high as possible?  I suppose in cases 
> where the interconnect is congested, setting it lower might allow lower 
> latency for other interconnect users, but I'd hope we don't have to worry 
> much about that.  So it doesn't seem to me that there would be any 
> advantage to setting it lower than the maximum.

It's true that the high threshold should be set as high as possible, and
this is what we do. Except for DSI command mode output on OMAP3, where,
for an unknown reason, the highest value (fifosize - 1) doesn't work and we
need to program it to fifosize - burstsize. And this was causing the
original problem: fifosize - burstsize was not working properly for
other outputs.

I guess this also hints that there's something wrong with omap3 and the
dss fifo thresholds.
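
To make the threshold arithmetic in this thread concrete, here is a
minimal Python sketch. All names are hypothetical. The 64-byte burst size
is an assumption inferred from 1024 - 960 (it matches a 16 x 32-bit
burst). This is not the driver code, only the numbers discussed above.

```python
# Sketch only: names are hypothetical, values are the ones quoted in this thread.

FIFO_SIZE = 1024   # bytes, one OMAP3 DISPC FIFO
BURST_SIZE = 64    # bytes; assumed, since 1024 - 960 = 64 (16 x 32-bit burst)
NUM_FIFOS = 3      # FIFOs combined by FIFO merge


def default_thresholds(fifo_size, burst_size):
    """Return (low, high) as (fifo_size - burst_size, fifo_size - 1)."""
    return fifo_size - burst_size, fifo_size - 1


def dsi_cmd_mode_high(fifo_size, burst_size):
    """High threshold of fifosize - burstsize (the DSI command mode value)."""
    return fifo_size - burst_size


single_low, single_high = default_thresholds(FIFO_SIZE, BURST_SIZE)
merged_size = FIFO_SIZE * NUM_FIFOS
merged_high = merged_size - 1
merged_high_workaround = dsi_cmd_mode_high(merged_size, BURST_SIZE)

print("single FIFO low/high:", single_low, single_high)
print("merged FIFO size:", merged_size)
print("merged high (fifosize - 1):", merged_high)
print("merged high (fifosize - burstsize):", merged_high_workaround)
```

With the merged FIFO, the 960/3071 pair is the one reported to work, and
a high threshold of 3008 is the one reported to underflow.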

> Probably the low threshold is the more important parameter, from a PM 
> perspective.  If you know the FIFO's drain rate and the low threshold, it 
> should be possible to calculate the maximum latency that the FIFO can 
> tolerate to avoid an underflow.  This could be used to specify a device PM 
> QoS constraint to prevent the interconnect latency from exceeding that 
> value.

Yes, this is how the low threshold should be adjusted. I have never
tried to calculate the required threshold, though, as I haven't had all
the information and understanding to properly calculate it.

> I'd guess the calculations would be something like this -- (I hope you can 
> correct my relative ignorance of the DSS in the following estimates):
> 
> Looking at mach-omap2/board-rx51-video.c, let's suppose that the FIFO 
> drain rate would be 864 x 480 x 32 bits/second.  Since the FIFO width is 
> 32 bits, that's

I think the DSS fifo entries are 8 bits on omap2/3 and 128 bits on omap4.
At least those are the "units" used with fifo size, threshold sizes,
burst size, etc.

>    864 x 480 = 414 720 FIFO entries/second, or
> 
>    (1 000 000 µs/s / 414 720 FIFO entries/s) = ~2.411 µs/FIFO entry.
> 
> So if you need a low FIFO threshold at 960 entries, you could call the 
> device PM QoS functions to set a wakeup latency constraint for the 
> interconnect of nothing greater than this:
> 
>    (2.411 µs/FIFO entry * 960 FIFO entries) = 2 314.56 µs
> 
> (The reality is that it would need to be something less than this, to 
> account for the time needed for the GFX DMA transfer to start supplying 
> data, etc.)

Makes sense.
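
The estimate above can be written out as a small Python sketch. The
function name is hypothetical, and the drain rate is simply Paul's
assumed 864 x 480 entries per second. The real figure depends on the
pixel clock and on the FIFO entry size, so treat the result as an
illustration of the method, not as a value to program.

```python
# Sketch only: hypothetical helper reproducing the estimate in this thread.

def max_tolerable_latency_us(low_threshold_entries, drain_rate_entries_per_s):
    """Time in microseconds to drain the FIFO from the low threshold to empty."""
    us_per_entry = 1_000_000 / drain_rate_entries_per_s
    return low_threshold_entries * us_per_entry


# Assumed drain rate from the example above: 864 x 480 entries per second.
drain_rate = 864 * 480

latency_960 = max_tolerable_latency_us(960, drain_rate)
latency_1600 = max_tolerable_latency_us(1600, drain_rate)

print(f"drain rate: {drain_rate} entries/s")
print(f"low threshold  960: {latency_960:.2f} us upper bound")
print(f"low threshold 1600: {latency_1600:.2f} us upper bound")
```

Without rounding the per-entry time to 2.411 µs first, the bound for 960
entries comes out slightly higher, around 2 314.8 µs. Either way the
usable constraint has to be lower, to leave time for the DMA to start
supplying data.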

Another cause of the underflows we see is the rotation engines: VRFB on
omap2/3, and TILER on omap4. Both increase the "work" needed to get
pixels, although I'm not sure what the actual causes for the increased
work are.

> The ultimate goal, with Jean's device PM QoS patches, is that these 
> constraints could change the DPLL autoidle settings or powerdomain states 
> to ensure the constraint was met.  He's got a page here:
> 
>   http://omappedia.org/wiki/Power_Management_Device_Latencies_Measurement
> 
> (Unfortunately it's not clear what the DPLL autoidle modes and voltage 
> scaling bits are set to for many of the estimates, and we also know that 
> there are many software optimizations possible for our idle path.)
> 
> We're still working on getting the OMAP device PM QoS patches merged, but 
> the Linux core support is there, so you should be able to patch your 
> drivers to use them -- see for example dev_pm_qos_add_request().

Thanks for the pointers, I need to study that.

> Just paging through the DSS TRM section, some other settings that might be 
> worth checking are:
> 
> - is DISPC_GFX_ATTRIBUTES.GFXBURSTSIZE set to 16x32?

Yes. (8 x 128 on omap4)

I presume each DMA burst has a small overhead, so maximizing the burst
size minimizes the overhead. Do you see any other effect with the burst
size? I mean, do you see any need to know the burst size value when
trying to calculate optimal thresholds?

> - is DISPC_GFX_ATTRIBUTES.GFXFIFOPRELOAD set to 1?

No. We set it to 0 so that PRELOAD is used. If I've understood right,
the problem with using GFXFIFOPRELOAD=1, i.e. high threshold is used for
preload value, is that the high threshold can be quite high, and the
preload needs to happen during vertical blanking. With a short vblank
time and a large high threshold, there may not be enough time for the
preload.

Then again, I have not verified that. And I'm not sure why it would be a
problem if the FIFO is not loaded up to the preload value during
blanking, presuming we still have enough pixels to proceed normally.

For me it would make more sense to always load the fifo to full, so
there wouldn't be any need for a PRELOAD value at all.

> - is DISPC_GFX_PRELOAD.PRELOAD set to the maximum possible value?

No, it's left at the default value. But I have tried adjusting this (and
also changing the GFXFIFOPRELOAD bit), and neither fixed the original
problem.

> - is DISPC_CONFIG.FIFOFILLING set to 1?

No, it's set to 0. With this problem there's only one overlay enabled so
it shouldn't have any effect.

 Tomi
