Re: [PATCH v2 05/22] drm/msm/dpu: move resource allocation to CRTC

Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx> · Thu, 26 Sep 2024 10:12:37 +0300

On Wed, Sep 25, 2024 at 02:49:48PM GMT, Abhinav Kumar wrote:
> On 9/25/2024 2:11 PM, Dmitry Baryshkov wrote:
> > On Wed, 25 Sept 2024 at 22:39, Jessica Zhang <quic_jesszhan@xxxxxxxxxxx> wrote:
> > > On 9/24/2024 4:13 PM, Dmitry Baryshkov wrote:
> > > > On Tue, Sep 24, 2024 at 03:59:21PM GMT, Jessica Zhang wrote:
> > > > > From: Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx>
> > > > > 
> > > > > All resource allocation is centered around the LMs. Then other blocks
> > > > > (except DSCs) are allocated basing on the LMs that was selected, and LM
> > > > > powers up the CRTC rather than the encoder.
> > > > > 
> > > > > Moreover if at some point the driver supports encoder cloning,
> > > > > allocating resources from the encoder will be incorrect, as all clones
> > > > > will have different encoder IDs, while LMs are to be shared by these
> > > > > encoders.
> > > > > 
> > > > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx>
> > > > > [quic_abhinavk@xxxxxxxxxxx: Refactored resource allocation for CDM]
> > > > > Signed-off-by: Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx>
> > > > > [quic_jesszhan@xxxxxxxxxxx: Changed to grabbing exising global state]
> > > > > Signed-off-by: Jessica Zhang <quic_jesszhan@xxxxxxxxxxx>
> > > > > ---
> > > > >    drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c    |  86 ++++++++++++
> > > > >    drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 201 +++++++++++-----------------
> > > > >    drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h |  19 +++
> > > > >    3 files changed, 183 insertions(+), 123 deletions(-)
> > > > > 
> > > > > @@ -544,159 +542,117 @@ void dpu_encoder_helper_split_config(
> > > > >       }
> > > > >    }
> > > > > 
> > > > > -bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc)
> > > > > +void dpu_encoder_update_topology(struct drm_encoder *drm_enc,
> > > > > +                             struct msm_display_topology *topology,
> > > > > +                             struct drm_atomic_state *state,
> > > > > +                             const struct drm_display_mode *adj_mode)
> > > > >    {
> > > > >       struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
> > > > > -    int i, intf_count = 0, num_dsc = 0;
> > > > > +    struct drm_connector *connector;
> > > > > +    struct drm_connector_state *conn_state;
> > > > > +    struct msm_display_info *disp_info;
> > > > > +    struct drm_framebuffer *fb;
> > > > > +    struct msm_drm_private *priv;
> > > > > +    int i;
> > > > > 
> > > > >       for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++)
> > > > >               if (dpu_enc->phys_encs[i])
> > > > > -                    intf_count++;
> > > > > +                    topology->num_intf++;
> > > > > 
> > > > > -    /* See dpu_encoder_get_topology, we only support 2:2:1 topology */
> > > > > +    /* We only support 2 DSC mode (with 2 LM and 1 INTF) */
> > > > >       if (dpu_enc->dsc)
> > > > > -            num_dsc = 2;
> > > > > +            topology->num_dsc += 2;
> > > > > 
> > > > > -    return (num_dsc > 0) && (num_dsc > intf_count);
> > > > > -}
> > > > > +    connector = drm_atomic_get_new_connector_for_encoder(state, drm_enc);
> > > > > +    if (!connector)
> > > > > +            return;
> > > > > +    conn_state = drm_atomic_get_new_connector_state(state, connector);
> > > > > +    if (!conn_state)
> > > > > +            return;
> > > > > 
> > > > > -struct drm_dsc_config *dpu_encoder_get_dsc_config(struct drm_encoder *drm_enc)
> > > > > -{
> > > > > -    struct msm_drm_private *priv = drm_enc->dev->dev_private;
> > > > > -    struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
> > > > > -    int index = dpu_enc->disp_info.h_tile_instance[0];
> > > > > +    disp_info = &dpu_enc->disp_info;
> > > > > 
> > > > > -    if (dpu_enc->disp_info.intf_type == INTF_DSI)
> > > > > -            return msm_dsi_get_dsc_config(priv->dsi[index]);
> > > > > +    priv = drm_enc->dev->dev_private;
> > > > > 
> > > > > -    return NULL;
> > > > > +    /*
> > > > > +     * Use CDM only for writeback or DP at the moment as other interfaces cannot handle it.
> > > > > +     * If writeback itself cannot handle cdm for some reason it will fail in its atomic_check()
> > > > > +     * earlier.
> > > > > +     */
> > > > > +    if (disp_info->intf_type == INTF_WB && conn_state->writeback_job) {
> > > > > +            fb = conn_state->writeback_job->fb;
> > > > > +
> > > > > +            if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb)))
> > > > > +                    topology->needs_cdm = true;
> > > > > +    } else if (disp_info->intf_type == INTF_DP) {
> > > > > +            if (msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]], adj_mode))
> > > > > +                    topology->needs_cdm = true;
> > > > > +    }
> > > > 
> > > > Just to note, the needs_cdm is not enough once you introduce CWB
> > > > support. E.g. DP/YUV420 + WB/YUV case requires two CDM blocks (which we
> > > > don't have), but this doesn't get reflected in the topology.
> > > 
> > > Hi Dmitry,
> > > 
> > > Ack. I can add something to make atomic_check fail if the input FB is
> > > YUV format and CWB is enabled.
> > 
> > I'd prefer for this to be more natural rather than just checking for
> > the DP && DP_YUV420 && WB && WB_FMT_YUV. In the worst case, count CDM
> > requests and then in RM check them against the catalog. But I had a
> > more logical (although more intrusive) implementation on my mind:
> > 
> > struct msm_display_topology {
> >      struct {
> >        u32 num_intf;
> >        u32 num_wb;
> >        u32 num_dsc;
> >        bool needs_cdm;
> >      } outputs[MAX_OUTPUTS];
> >      u32 num_lm;
> > };
> > 
> > WDYT?
> > 
> 
> the struct msm_display_topology was originally designed as a per-encoder
> struct (dpu_encoder_get_topology() indicates the same). Making this an array
> of outputs was not needed as there is expected to be one struct
> msm_display_topology for each virt encoder's requested topology and the
> blocks inside of it other than LM, are "encoder" hw blocks.
> 
> needs_cdm was made a boolean instead of a num_cdm_count like other hardware
> blocks because till the most recent chipset, we have only one CDM block.
> Whenever we do have more CDM blocks why will introducing num_cdm to the
> topology struct not solve your problem rather than making it an array of
> outputs?
> 
> Because then, RM will know that the request exceeds the max blocks.
> 
> I think what you are trying to do now is make struct msm_display_topology's
> encoder parts per-encoder and rest like num_lm per "RM session".
> 
> The thought is not wrong but at the same time seems a bit of an overkill
> because its mostly already like that. Apart from CDM for which I have no
> indication of another one getting added, rest of the blocks are already
> aligned towards a per-encoder model and not a "RM session" model.

But we should be leaning towards RM session.

> 
> Even if we end up doing it this way, most of the model is kind of unused
> really because each encoder will request its own topology anyway, there is
> just no aggregation for CDM which at this point is not needed as there is no
> HW we are aware of needing this.

With the resource allocation shifted to the CRTC individual encoders
do not request their own topology (as it is now a property of the full
output pipeline, not just an encoder). Yes, CDM aggregation into num_cdm
seems unnecessary as there is just one CDM block.

> I think the atomic_check validation is needed either way because if two
> encoders request cdm, we cannot do clone mode as there is only one cdm block
> today. Its just that we are not tracking num_cdm today due to reasons
> explained above but basically doing something like below seems right to me:
> 
> if (enc_is_in_clone_mode && needs_cdm)
> 	return -ENOTSUPPORTED;

This check is incorrect in my opinion. The hardware should be able to
support DP/YUV420 + WB/RGB and DP/RGB + WB/YUV combinations. Please
correct me if I'm wrong.

> When we add more cdm_blocks, we can drop this check and making needs_cdm a
> num_cdm will make it naturally fail.

-- 
With best wishes
Dmitry