RE: [PATCH v15 3/5] drm: renesas: Add RZ/G2L DU Support

Biju Das <biju.das.jz@xxxxxxxxxxxxxx> · Thu, 14 Dec 2023 15:24:17 +0000

Hi Maxime Ripard,

Thanks for the feedback.

> -----Original Message-----
> From: Maxime Ripard <mripard@xxxxxxxxxx>
> Sent: Wednesday, December 13, 2023 3:47 PM
> To: Biju Das <biju.das.jz@xxxxxxxxxxxxxx>
> Subject: Re: [PATCH v15 3/5] drm: renesas: Add RZ/G2L DU Support
> 
> On Tue, Nov 28, 2023 at 10:51:27AM +0000, Biju Das wrote:
> > The LCD controller is composed of Frame Compression Processor (FCPVD),
> > Video Signal Processor (VSPD), and Display Unit (DU).
> >
> > It has DPI/DSI interfaces and supports a maximum resolution of 1080p
> > along with 2 RPFs to support the blending of two picture layers and
> > raster operations (ROPs).
> >
> > The DU module is connected to VSPD. Add RZ/G2L DU support for RZ/G2L
> > alike SoCs.
> >
> > Signed-off-by: Biju Das <biju.das.jz@xxxxxxxxxxxxxx>
> 
> I'd still like a review from Geert or Laurent, I don't know the SoC.

Since Laurent is super busy, maybe Kieran and Jacopo can provide feedback if any.

The display blocks is shown in [1] and the pipe line is as below

Memory-> fcpvd -->VSPD --> DU --> DSI --> Display panel.

Fcpvd: Frame Compression Processor
VSPD: Video Signal Processor, Basically a blitter engine which directly render images to DU
DU: Display Unit.

On R-Car fpvcd, VSPD and (DU with planes) IPs are separate blocks
whereas here it is integrated inside LCDC.

fcpvd and VSPD are in media subsystem and we are reusing the code here.
The vspd is based on display list, it downloads the register contents from linked list in memory
and execute composite operation and generates interrupts for display start and frame end.

du_vsp registers the frame completion callback with vspd driver in media framework.
and we call the drm_crtc_handle_vblank() in this context.

[1]
https://pasteboard.co/MDmbXdK3psSD.png

● FCPVD
− Support out-of-order for the whole outstanding transactions
− Read linear addressing image data
− Read display list data
− Write image data
● VSPD
− Supports various data formats and conversion
− Supports YCbCr444/422/420, RGB, α RGB, α plane
− Color space conversion and changes to the number of colors by dithering
− Color keying
− Supports combination between pixel alpha and global alpha
− Supports generating pre multiplied alpha
− Video processing
− Blending of two picture layers and raster operations (ROPs)
− Clipping
− 1D look up table
− Vertical flipping in case of output to memory
− Direct connection to display module
− Supporting 1920 pixels in horizontal direction
− Writing back image data which is transferred to Display Unit (DU) to memory
● DU
− Supporting Display Parallel Interface (DPI) and MIPI LINK Video Interface
− Display timing master
− Generating video timings (Front porch, Back porch, Sync active, Active video area)
− Selecting the polarity of output DCLK, HSYNC, VSYNC, and DE
− Supporting Progressive (Non-interlace)
− Not supports Interlace
− Input data format (from VSPD): RGB888, RGB666 (not supports dithering of RGB565)
− Output data format: same as Input data format
− Supporting Full HD (1920 pixels × 1080 lines) for MIPI-DSI Output
− Supporting WXGA (1280 pixels × 800 lines) for Parallel Output

> 
> > +static int rzg2l_du_crtc_get(struct rzg2l_du_crtc *rcrtc) {
> > +	int ret;
> > +
> > +	/*
> > +	 * Guard against double-get, as the function is called from both the
> > +	 * .atomic_enable() and .atomic_flush() handlers.
> > +	 */
> > +	if (rcrtc->initialized)
> > +		return 0;
> > +
> > +	ret = clk_prepare_enable(rcrtc->rzg2l_clocks.aclk);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	ret = clk_prepare_enable(rcrtc->rzg2l_clocks.pclk);
> > +	if (ret < 0)
> > +		goto error_bus_clock;
> > +
> > +	ret = reset_control_deassert(rcrtc->rstc);
> > +	if (ret < 0)
> > +		goto error_peri_clock;
> > +
> > +	rzg2l_du_crtc_setup(rcrtc);
> > +	rcrtc->initialized = true;
> > +
> > +	return 0;
> > +
> > +error_peri_clock:
> > +	clk_disable_unprepare(rcrtc->rzg2l_clocks.pclk);
> > +error_bus_clock:
> > +	clk_disable_unprepare(rcrtc->rzg2l_clocks.aclk);
> > +	return ret;
> > +}
> 
> [...]
> 
> > +static void rzg2l_du_crtc_atomic_flush(struct drm_crtc *crtc,
> > +				       struct drm_atomic_state *state) {
> > +	struct rzg2l_du_crtc *rcrtc = to_rzg2l_crtc(crtc);
> > +	struct drm_device *dev = rcrtc->crtc.dev;
> > +	unsigned long flags;
> > +
> > +	WARN_ON(!crtc->state->enable);
> > +
> > +	/*
> > +	 * If a mode set is in progress we can be called with the CRTC
> disabled.
> > +	 * We thus need to first get and setup the CRTC in order to
> configure
> > +	 * planes. We must *not* put the CRTC, as it must be kept awake
> until
> > +	 * the .atomic_enable() call that will follow. The get operation in
> > +	 * .atomic_enable() will in that case be a no-op, and the CRTC will
> be
> > +	 * put later in .atomic_disable().
> > +	 */
> > +	rzg2l_du_crtc_get(rcrtc);
> 
> That's a bit suspicious. Have you looked at
> drm_atomic_helper_commit_tail_rpm() ?

Ok, I will drop this and start using  drm_atomic_helper_commit_tail_rpm()
instead of rzg2l_du_atomic_commit_tail().

> 
> > +static int rzg2l_du_crtc_enable_vblank(struct drm_crtc *crtc) {
> > +	struct rzg2l_du_crtc *rcrtc = to_rzg2l_crtc(crtc);
> > +
> > +	rcrtc->vblank_enable = true;
> > +
> > +	return 0;
> > +}
> > +
> > +static void rzg2l_du_crtc_disable_vblank(struct drm_crtc *crtc) {
> > +	struct rzg2l_du_crtc *rcrtc = to_rzg2l_crtc(crtc);
> > +
> > +	rcrtc->vblank_enable = false;
> > +}
> 
> You should enable / disable your interrupts here

We don't have dedicated vblank IRQ for enabling/disabling vblank.

vblank is handled by vspd.

vspd is directly rendering images to display,
rcar_du_crtc_finish_page_flip() and drm_crtc_handle_vblank()
called in vspd's pageflip context.

See rzg2l_du_vsp_complete()in rzg2l_du_vsp.c

> 
> > +int rzg2l_du_crtc_create(struct rzg2l_du_device *rcdu) {
> > +	struct rzg2l_du_crtc *rcrtc = &rcdu->crtcs[0];
> > +	struct drm_crtc *crtc = &rcrtc->crtc;
> > +	struct drm_plane *primary;
> > +	int ret;
> > +
> > +	rcrtc->rstc = devm_reset_control_get_shared(rcdu->dev, NULL);
> > +	if (IS_ERR(rcrtc->rstc)) {
> > +		dev_err(rcdu->dev, "can't get cpg reset\n");
> > +		return PTR_ERR(rcrtc->rstc);
> > +	}
> > +
> > +	rcrtc->rzg2l_clocks.aclk = devm_clk_get(rcdu->dev, "aclk");
> > +	if (IS_ERR(rcrtc->rzg2l_clocks.aclk)) {
> > +		dev_err(rcdu->dev, "no axi clock for DU\n");
> > +		return PTR_ERR(rcrtc->rzg2l_clocks.aclk);
> > +	}
> > +
> > +	rcrtc->rzg2l_clocks.pclk = devm_clk_get(rcdu->dev, "pclk");
> > +	if (IS_ERR(rcrtc->rzg2l_clocks.pclk)) {
> > +		dev_err(rcdu->dev, "no peripheral clock for DU\n");
> > +		return PTR_ERR(rcrtc->rzg2l_clocks.pclk);
> > +	}
> > +
> > +	rcrtc->rzg2l_clocks.dclk = devm_clk_get(rcdu->dev, "vclk");
> > +	if (IS_ERR(rcrtc->rzg2l_clocks.dclk)) {
> > +		dev_err(rcdu->dev, "no video clock for DU\n");
> > +		return PTR_ERR(rcrtc->rzg2l_clocks.dclk);
> > +	}
> > +
> > +	init_waitqueue_head(&rcrtc->flip_wait);
> > +	rcrtc->dev = rcdu;
> > +
> > +	primary = &rcrtc->vsp->planes[rcrtc->vsp_pipe].plane;
> > +
> > +	ret = drm_crtc_init_with_planes(&rcdu->ddev, crtc, primary, NULL,
> > +					&crtc_funcs_rz, NULL);
> > +	if (ret < 0)
> > +		return ret;
> 
> You should use the drmm variant here

Agreed.

> 
> > +static void rzg2l_du_atomic_commit_tail(struct drm_atomic_state
> > +*old_state) {
> > +	struct drm_device *dev = old_state->dev;
> > +
> > +	/* Apply the atomic update. */
> > +	drm_atomic_helper_commit_modeset_disables(dev, old_state);
> > +	drm_atomic_helper_commit_planes(dev, old_state,
> > +					DRM_PLANE_COMMIT_ACTIVE_ONLY);
> > +	drm_atomic_helper_commit_modeset_enables(dev, old_state);
> > +
> > +	drm_atomic_helper_commit_hw_done(old_state);
> > +	drm_atomic_helper_wait_for_flip_done(dev, old_state);
> > +
> > +	drm_atomic_helper_cleanup_planes(dev, old_state); }
> 
> AFAIK, there's no reason to roll your commit_tail implementation yourself
> here. You should at the very least document why you needed to do so.

Will drop this and use drm_atomic_helper_commit_tail_rpm().

> 
> > +int rzg2l_du_vsp_map_fb(struct rzg2l_du_vsp *vsp, struct
> drm_framebuffer *fb,
> > +			struct sg_table sg_tables[3])
> > +{
> > +	struct rzg2l_du_device *rcdu = vsp->dev;
> > +	unsigned int i, j;
> > +	int ret;
> > +
> > +	for (i = 0; i < fb->format->num_planes; ++i) {
> > +		struct drm_gem_dma_object *gem = drm_fb_dma_get_gem_obj(fb,
> i);
> > +		struct sg_table *sgt = &sg_tables[i];
> > +
> > +		if (gem->sgt) {
> > +			struct scatterlist *src;
> > +			struct scatterlist *dst;
> > +
> > +			/*
> > +			 * If the GEM buffer has a scatter gather table, it has
> > +			 * been imported from a dma-buf and has no physical
> > +			 * address as it might not be physically contiguous.
> > +			 * Copy the original scatter gather table to map it to
> > +			 * the VSP.
> > +			 */
> > +			ret = sg_alloc_table(sgt, gem->sgt->orig_nents,
> > +					     GFP_KERNEL);
> > +			if (ret)
> > +				goto fail;
> > +
> > +			src = gem->sgt->sgl;
> > +			dst = sgt->sgl;
> > +			for (j = 0; j < gem->sgt->orig_nents; ++j) {
> > +				sg_set_page(dst, sg_page(src), src->length,
> > +					    src->offset);
> > +				src = sg_next(src);
> > +				dst = sg_next(dst);
> > +			}
> > +		} else {
> > +			ret = dma_get_sgtable(rcdu->dev, sgt, gem->vaddr,
> > +					      gem->dma_addr, gem->base.size);
> > +			if (ret)
> > +				goto fail;
> > +		}
> > +
> > +		ret = vsp1_du_map_sg(vsp->vsp, sgt);
> > +		if (ret) {
> > +			sg_free_table(sgt);
> > +			goto fail;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +
> > +fail:
> > +	while (i--) {
> > +		struct sg_table *sgt = &sg_tables[i];
> > +
> > +		vsp1_du_unmap_sg(vsp->vsp, sgt);
> > +		sg_free_table(sgt);
> > +	}
> > +
> > +	return ret;
> > +}
> 
> Ditto, you should at least document why you're not using the helpers.

Ok, I will drop these callbacks and directly use drm_fb_dma_get_gem_obj()in
vsp_plane_setup() as there is no iommu here.

+	for (i = 0; i < state->format->planes; ++i) {
+		struct drm_gem_dma_object *gem;
+
+		gem = drm_fb_dma_get_gem_obj(fb, i);
+		cfg.mem[i] = gem->dma_addr + fb->offsets[i];
+	} 

> 
> > +static void rzg2l_du_vsp_cleanup(struct drm_device *dev, void *res) {
> > +	struct rzg2l_du_vsp *vsp = res;
> > +	unsigned int i;
> > +
> > +	for (i = 0; i < vsp->num_planes; ++i) {
> > +		struct rzg2l_du_vsp_plane *plane = &vsp->planes[i];
> > +
> > +		drm_plane_cleanup(&plane->plane);
> > +	}
> > +
> > +	kfree(vsp->planes);
> > +
> > +	put_device(vsp->vsp);
> > +}
> > +
> > +int rzg2l_du_vsp_init(struct rzg2l_du_vsp *vsp, struct device_node *np,
> > +		      unsigned int crtcs)
> > +{
> > +	struct rzg2l_du_device *rcdu = vsp->dev;
> > +	struct platform_device *pdev;
> > +	unsigned int num_crtcs = hweight32(crtcs);
> > +	unsigned int num_planes = 2;
> > +	unsigned int i;
> > +	int ret;
> > +
> > +	/* Find the VSP device and initialize it. */
> > +	pdev = of_find_device_by_node(np);
> > +	if (!pdev)
> > +		return -ENXIO;
> > +
> > +	vsp->vsp = &pdev->dev;
> > +
> > +	ret = drmm_add_action_or_reset(&rcdu->ddev, rzg2l_du_vsp_cleanup,
> vsp);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	ret = vsp1_du_init(vsp->vsp);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	vsp->planes = kcalloc(num_planes, sizeof(*vsp->planes), GFP_KERNEL);
> > +	if (!vsp->planes)
> > +		return -ENOMEM;
> 
> drmm_kcalloc or drmm_kmalloc_array

I get crashes at random places, if I use drmm_kcalloc().
Not sure it is related to this? There is no crash if I use kcalloc.

> 
> > +
> > +	for (i = 0; i < num_planes; ++i) {
> > +		enum drm_plane_type type = i < num_crtcs
> > +					 ? DRM_PLANE_TYPE_PRIMARY
> > +					 : DRM_PLANE_TYPE_OVERLAY;
> > +		struct rzg2l_du_vsp_plane *plane = &vsp->planes[i];
> > +
> > +		plane->vsp = vsp;
> > +		plane->index = i;
> > +		ret = drm_universal_plane_init(&rcdu->ddev, &plane->plane,
> > +					       crtcs, &rzg2l_du_vsp_plane_funcs,
> > +					       rzg2l_du_vsp_formats,
> > +					       ARRAY_SIZE(rzg2l_du_vsp_formats),
> > +					       NULL, type, NULL);
> > +		if (ret < 0)
> > +			return ret;
> 
> you need to use drmm variant here too.

I did rebased to latest drm_misc_next and I don't find the drmm_universal_plane_init()

Can you please point me to the API?

Cheers,
Biju