Re: [PATCH v13 1/2] drm/tegra: dc: Support memory bandwidth management

Dmitry Osipenko <digetx@xxxxxxxxx> · Mon, 8 Mar 2021 17:20:54 +0300

06.03.2021 02:02, Michał Mirosław wrote:
> On Fri, Mar 05, 2021 at 12:45:51AM +0300, Dmitry Osipenko wrote:
>> 04.03.2021 02:08, Michał Mirosław wrote:
>>> On Tue, Mar 02, 2021 at 03:44:44PM +0300, Dmitry Osipenko wrote:
>>>> Display controller (DC) performs isochronous memory transfers, and thus,
>>>> has a requirement for a minimum memory bandwidth that shall be fulfilled,
>>>> otherwise framebuffer data can't be fetched fast enough and this results
>>>> in a DC's data-FIFO underflow that follows by a visual corruption.
> [...]
>>>> +	/*
>>>> +	 * Horizontal downscale takes extra bandwidth which roughly depends
>>>> +	 * on the scaled width.
>>>> +	 */
>>>> +	if (src_w > dst_w)
>>>> +		mul = (src_w - dst_w) * bpp / 2048 + 1;
>>>> +	else
>>>> +		mul = 1;
>>>
>>> Does it really need more bandwidth to scale down? Does it read the same
>>> data multiple times just to throw it away?
>> The hardware isn't optimized for downscale, it indeed takes more
>> bandwidth. You'll witness a severe underflow of plane's memory FIFO
>> buffer on trying to downscale 1080p plane to 50x50.
> [...]
> 
> In your example, does it really need 16x the bandwidth compared to
> no scaling case?  The naive way to implement downscaling would be to read
> all the pixels and only take every N-th.  Maybe the problem is that in
> downscaling mode the latency requirements are tighter?  Why would bandwidth
> required be proportional to a difference between the widths (instead e.g.
> to src/dst or dst*cacheline_size)?

It seems you're right: it's actually not the bandwidth. I recently added
support for gathering memory-client statistics to grate-kernel for Tegra20,
and it shows that the consumed bandwidth is actually lower when a plane is
downscaled.

So it should be the latency, which depends on the memory frequency and
thus on the bandwidth. I'll try to improve the code comment in the next
version, thanks.
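
For reference, the numbers under discussion can be reproduced with a small
stand-alone sketch. It only restates the multiplier from the quoted hunk and,
next to it, the plain src/dst width ratio suggested above as an alternative.
The helper names downscale_mul() and width_ratio() are hypothetical and not
part of the patch, and bpp is assumed here to mean bits per pixel.

```c
/*
 * Hypothetical helper, not part of the patch: restates the multiplier
 * from the quoted hunk. bpp is assumed to be bits per pixel.
 */
static unsigned int downscale_mul(unsigned int src_w, unsigned int dst_w,
				  unsigned int bpp)
{
	/* Only horizontal downscale gets a multiplier above 1. */
	if (src_w > dst_w)
		return (src_w - dst_w) * bpp / 2048 + 1;

	return 1;
}

/*
 * Hypothetical alternative for comparison: integer ratio of source to
 * destination width, clamped to 1 when there is no downscale.
 */
static unsigned int width_ratio(unsigned int src_w, unsigned int dst_w)
{
	if (dst_w == 0 || src_w <= dst_w)
		return 1;

	return src_w / dst_w;
}
```

For a 1920-pixel-wide plane scaled down to 50 pixels, downscale_mul() gives
15 at an assumed 16 bpp and 30 at an assumed 32 bpp, while width_ratio()
gives 38. Neither value is claimed to match the hardware; the sketch only
makes the arithmetic behind the question easy to check.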