Re: [PATCH 2/2] drm/vkms: Use a simpler composition function

Arthur Grillo <arthurgrillo@xxxxxxxxxx> · Wed, 7 Feb 2024 17:21:41 -0300



On 07/02/24 13:03, Louis Chauvet wrote:
> Hello Pekka, Arthur,
> 
> [...]
> 
>>>> Would it be possible to have a standardised benchmark specifically
>>>> for performance rather than correctness, in IGT or where-ever it
>>>> would make sense? Then it would be simple to tell contributors to
>>>> run this and report the numbers before and after.
>>>>
>>>> I would propose this kind of KMS layout:
>>>>
>>>> - CRTC size 3841 x 2161
>>>> - primary plane, XRGB8888, 3639 x 2161 @ 101,0
>>>> - overlay A, XBGR2101010, 3033 x 1777 @ 201,199
>>>> - overlay B, ARGB8888, 1507 x 1400 @ 1800,250
>>>>
>>>> The sizes and positions are deliberately odd to try to avoid happy
>>>> alignment accidents. The planes are big, which should let the pixel
>>>> operations easily dominate performance measurement. There are
>>>> different pixel formats, both opaque and semi-transparent. There is
>>>> lots of plane overlap. The planes also do not cover the whole CRTC
>>>> leaving the background visible a bit.
>>>>
>>>> There should be two FBs per plane, flipped alternately each
>>>> frame. Writeback should be active. Run this a number of frames, say,
>>>> 100, and measure the kernel CPU time taken. It's supposed to take at
>>>> least several seconds in total.
>>>>
>>>> I think something like this should be the base benchmark. One can
>>>> add more to it, like rotated planes, YUV planes, etc. or switch
>>>> settings on the existing planes. Maybe even FB_DAMAGE_CLIPS. Maybe
>>>> one more overlay that is very tall and thin.
>>>>
>>>> Just an idea, what do you all think?  
>>>
>>> Hi Pekka,
>>>
>>> I just finished writing this proposal using IGT.
>>>
>>> I got pretty interesting results:
>>>
>>> The mentioned commit 8356b97906503a02125c8d03c9b88a61ea46a05a took
>>> around 13 seconds, while drm-misc/drm-misc-next took 36 seconds.
>>>
>>> I'm currently bisecting to be certain that the change to
>>> pixel-by-pixel composition is the culprit, but I don't see why it
>>> wouldn't be.
>>>
>>> I just need to do some final touches on the benchmark code and it
>>> will be ready for review.
>>
>> Awesome, thank you very much for doing that!
>> pq
> 
> I also think it's a good benchmark for classic configurations. The odd 
> size is a very nice idea to verify the corner cases of line-by-line 
> algorithms.
> 
> When this is ready, please share the test, so I can check if my patch is 
> as performant as before.
> 
> Thank you for this work.
> 
> Have a nice day,
> Louis Chauvet
> 

Just sent the benchmark for review:
https://lore.kernel.org/r/20240207-bench-v1-1-7135ad426860@xxxxxxxxxx
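
As a quick sanity check of the layout Pekka proposed above, the plane
geometry can be worked out in a few lines of Python. This is only a sketch
and not part of the benchmark: the Rect helper and the variable names are
made up for illustration, only the sizes and positions come from the
proposal. It shows how many source pixels get read per frame, how much the
two overlays overlap, and how much background stays visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: top-left corner (x, y) plus width and height."""
    x: int
    y: int
    w: int
    h: int

    def area(self) -> int:
        return self.w * self.h

    def intersect(self, other: "Rect") -> "Rect":
        """Overlap of two rectangles (zero-sized if they are disjoint)."""
        x0 = max(self.x, other.x)
        y0 = max(self.y, other.y)
        x1 = min(self.x + self.w, other.x + other.w)
        y1 = min(self.y + self.h, other.y + other.h)
        return Rect(x0, y0, max(0, x1 - x0), max(0, y1 - y0))


# Sizes and positions taken from the proposed KMS layout.
crtc = Rect(0, 0, 3841, 2161)
primary = Rect(101, 0, 3639, 2161).intersect(crtc)     # XRGB8888
overlay_a = Rect(201, 199, 3033, 1777).intersect(crtc)  # XBGR2101010
overlay_b = Rect(1800, 250, 1507, 1400).intersect(crtc)  # ARGB8888

# Every visible plane pixel has to be read once per composed frame.
pixels_read_per_frame = primary.area() + overlay_a.area() + overlay_b.area()

# Area where both overlays stack on top of each other.
overlay_overlap = overlay_a.intersect(overlay_b).area()

# Inclusion-exclusion over the three planes gives the covered CRTC area.
covered = (
    pixels_read_per_frame
    - primary.intersect(overlay_a).area()
    - primary.intersect(overlay_b).area()
    - overlay_overlap
    + primary.intersect(overlay_a).intersect(overlay_b).area()
)
background = crtc.area() - covered

print("pixels read per frame:", pixels_read_per_frame)
print("overlay A/B overlap:  ", overlay_overlap)
print("visible background:   ", background)
```

Both overlays sit entirely inside the primary plane, so the only visible
background is the strip of CRTC columns the primary plane does not cover.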