Re: [RFC PATCH v2 06/13] drm/msm/dpu: switch RM to use crtc_id rather than enc_id for allocation

Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx> · Fri, 19 May 2023 04:50:37 +0300

On 19/05/2023 02:46, Abhinav Kumar wrote:

On 3/20/2023 6:18 PM, Dmitry Baryshkov wrote:
Up to now the driver has been using encoder to allocate hardware
resources. Switch it to use CRTC id in preparation for the next step.

This decision to use encoder id instead of CRTC has been there 
downstream for quite sometime. So most of the delay in reviewing this 
series was trying to understand why we had this in the first place and 
who knew that story.

You are right that from the perspective of clone mode mapping resources 
to CRTC is correct but using encoder id is what has been working so far 
without too much difficulty with a little more management  But another 
use-case can get simplified with this.

Thank you for the historical perspective and for the feedback.

I think that keeping resource allocation in dpu_encoder was required 
when INTF/WB themselves were allocated through the RM. However as 
INTF/WB blocks are now allocated in a static way, it doesn't make so 
much sense anymore.

There is another angle to this. There are hardware blocks which can do 
writeback and the physical display concurrently. We call it concurrent 
writeback or CWB. This is present even on some of the chipsets already 
supported upstream.

Now, lets say we start a concurrent writeback session , in todays code 
we will allocate the resources with the encoder id of the writeback's 
encoder and the other physical display's encoder.

When the session stops, we can just deallocate the resources of the 
writeback encoder without touching the other encoder. So it will become 
easier to just free up the resources mapped to the encoder.

I have not looked into CWB programming. However from your description it 
would be as easy to do a full reallocation of the pipeline as just 
dropping the CWB/extra encoder. In fact this is what the driver was 
doing: in case of a modeset, drop all old resources and allocate full 
set of resources.

With clone mode implemented with CRTC id to map resources, we will have 
to probably do some extra book-keeping to handle concurrent writeback.

Probably not much. We'd have to describe the topology/requirements and 
then pass that to RM. I have been waiting for this patchset (and up to 
some point the DSC/ctl) to be reviewed before finalizing/submitting the 
patches that rework the CTL interface to use this kind of data structure.

Thats the use-case which gets impacted with this but for others, there 
shouldnt be a major impact from what we see.

That being said, what benefit are you seeing from making that change now 
for this series? Why is it specifically needed for virtual planes?

I see in the commit text that you have mentioned this is in preparation 
for next step and next step talks about clone mode. But clone mode is 
not there yet. So why this change now?

There were several items that triggered this review.

First thing first. Current design allocates resources from 
dpu_encoder_virt_atomic_check(), then in 
dpu_encoder_virt_atomic_mode_set() the driver has to poke manually in 
the cstate and fill CTL and LM. This kept on bugging me for some time. 
The encoder should not be poking into the CRTC state.

Then came the virtual planes. I think you'd agree that for the virtual 
planes we assign SSPPs to the CRTCs. My initial design kept enc_id for 
all the resources except the SSPP blocks (which were allocated per 
crtc_id). This was not perfect from the symmetry point of view.

Above all, filling the cstate in mode_set was too late for atomic_check 
to look into allocated LM to see if it supports MIXER_SOURCESPLIT or 
not. See dpu_plane_atomic_check().

I started by moving the cstate filling from the 
dpu_encoder_virt_atomic_mode_set() to dpu_encoder_virt_atomic_check(). 
And then it just became natural to turn it to be CRTC-centric code. The 
encoder doesn't have to peek into CRTC state. CRTC/plane do not have to 
delay the checks becasuse the necessary data is provided by the other 
party at a later point.

Resource allocation is centered around LMs for the blocks we have seen 
so far like LM, SSPP  but  ....

And LM is a CRTC.

DSC is already an exception because in earlier chipsets we saw that PP 
and DSC go together and cannot be muxed but later on that changed.

Now, I have explained this case but I am not getting why we need to make 
the change for this series. Is it absolutely necessary?

Strictly, we don't. I can work around all the deficiencies of the 
current code. But it is more natural to do it now rather than later.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx>
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c |  16 +--
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h     |  10 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c      | 119 ++++++++++----------
  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h      |  15 ++-
  4 files changed, 77 insertions(+), 83 deletions(-)

[trimmed the patch contents, it is not relevant to this discussion]

--
With best wishes
Dmitry