Re: [PATCH v7 00/27] drm: sun4i: add Display Engine 3.3 (DE33) support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Maxime,

I'd like to pick your brain for the issue below.

Dne sobota, 22. februar 2025 ob 10:28:51 Srednjeevropski standardni čas je Jernej Škrabec napisal(a):
> Dne sobota, 22. februar 2025 ob 03:48:01 Srednjeevropski standardni čas je Ryan Walklin napisal(a):
> > On Sat, 22 Feb 2025, at 7:57 AM, Jernej Škrabec wrote:
> > > Hi Ryan,
> > >
> > > sorry for very late review, but here we go...
> > 
> > No problem, thanks for the review!
> > 
> > > This patchset actually introduces 3 disticnt features, which should IMO 
> > > be separated
> > > and thus making reviewing patches easier.
> > >
> > > 1. native 10-bit YUV420 output (without YUV->RGB->YUV conversions) - 
> > > this is used on
> > >     HDMI for proper 10-bit 4K@60 HDR output
> > > 2. Display Engine 3.3 (DE33) support
> > > 3. AFBC modifier/format support (used for more efficient GPU or VPU 
> > > output rendering)
> > >
> > > If I'm not mistaken, your goal is only DE33 support. 
> > 
> > This is my initial goal, but I would still like good HDMI and video support (including HW decode). 
> > 
> > It took some refactoring and resequencing of (your) existing driver work to get to this series, and I have left out the HDMI and separated the TCON code for the same reason, that initially I am intending to just support RGB operation to an LCD. I do intend however to use the HDMI code either in or out of tree but think that will be a much bigger/more complex change to mainline given the current HDMI code is more invasive to the generic driver.
> > 
> > The YUV and AFBC changes seemed more self-contained and seemed reasonable approaches given that they were both features of the DE3 and up. I am happy either to split these into either 4 or 5 separate patches ie:
> > 
> > 1a. refactor CSC code to prepare for YUV
> > 1b. add YUV for DE3
> > 2. refactor mixer enumeration and regmaps to support DE3+
> > 3. add DE33
> > 4. Add AFBC
> > 
> > My only reluctance is that that adds more work to manage multiple patches which are logically grouped and in terms of ease of review, all these 4 steps are in the current set in that order (apart from AFBC which is completely standalone), and I don't think submitting them separately reduces complexity. It however will reduce reviewer burden as you suggest, but this has been a slow process already.
> 
> Sorry, completely forgot. YUV420 HDMI code relies on my previous work, with which
> Maxime wasn't happy with:
> 
> https://lore.kernel.org/linux-sunxi/20230924192604.3262187-1-jernej.skrabec@xxxxxxxxx/
> 
> So unless switching HDMI to bridge ops is implemented, which also brings format,
> YUV formatter and some other patches just add unused code, which isn't ideal,
> especially if we decide to rework driver before that code can be put in acceptable
> state for all involved.
> 
> From quick look, patches 5-13, 26 should be dropped for now. Not sure about 1-4.
> 
> I'm fine with AFBC support going in, it's just one patch.
> 
> > 
> > I am happy to accept either process but this has been some time already now with lots of stylistic feedback but not much on the correctness of the approach and code, and I am keen to get this landed.
> > 
> > There is are two 
> > > reasons why
> > > I've never sent these patches myself:
> > >
> > > 1. scaling YUV images doesn't work
> > >
> > > Not a problem for any user who doesn't use video player with DRM plane 
> > > rendering,
> > > which I assume is most of them. We can get around of this issue to deny 
> > > scaling
> > > YUV buffers until the issue is sorted out.
> > 
> > Good to know, I hadn't appreciated that. Mostly relevant for video as you say and it would be good to land YUV support without scaling, then extend subsequently, possibly when we get to the video engine?
> 
> Correct.
> 
> Just be aware of one thing. Even if YUV buffer is rendered in 1:1 scale, vi scaler
> still needs to be configured. That's because U and V planes are subsampled and
> need to be scaled to Y plane size. For unknown reason, that works just fine, but
> if Y scale is bigger than 1, it all falls apart.
> 
> This should be implemented in atomic check.
> 
> > 
> > > 2. plane allocations are hackish
> > >
> > > DE33 for the first time introduced option to move planes between the 
> > > mixers. DRM
> > > has also support for this, but for time being static allocation is 
> > > acceptable and tbh,
> > > even dynamic support need appropriate initial setting.
> > > As you might guessed this is done in DE33 clock driver using magic 
> > > values. Proper
> > > way would be to introduce some kind of connection between mixer/plane 
> > > and clock
> > > driver, so initial configuration can be moved to appropriate module 
> > > instead of
> > > being thrown into clock driver. I need to check where to put it and how 
> > > to properly
> > > connect DT nodes. Maybe syscon phandle? I'll do some research.
> > 
> > Thanks, I was not aware of this either.
> 
> Here you have some pointers how this values are actually set:
> https://github.com/orangepi-xunlong/linux-orangepi/blob/orange-pi-5.15-sun55iw3/bsp/drivers/video/sunxi/disp2/disp/de/lowlevel_v33x/de330/de_top.c#L232-L260
> 
> I think this is the biggest issue of this code. As soon as we solve it properly, we
> have an ability to implement all remaining features at a later date.
> 

DE33 has introduced "shared" planes, which can be allocated to any mixer. They
have now distinct memory region and are not included in mixer regions anymore
as it was the case with DE2 and DE3. This patchset maps whole planes region to
mixer0 which was a hack to get quick result and to push problem to a later time,
which is obviously now.

I see two solutions:
1. All mixers map same region for planes. Not sure if this is acceptable, but it's
    far the easiest to implement (already done).
2. Implement separate plane driver. Each mixer would have phandle to plane node
    and could call plane management functions there, like switch plane crtc. This
    to me sounds like a better solution.

What do you think?

Best regards,
Jernej

> > 
> > > There is one glaring issue (when you work on driver and test every 
> > > aspect of it).
> > > DE33 introduced RCQ, which is basically DMA, which copies shadowed 
> > > registers
> > > from memory buffer directly to hw registers. IIRC it does that at 
> > > vblank time. This
> > > tells you that current concept of writing register values at any time 
> > > userspace feels
> > > to do commit is wrong and we've been lucky that it works as well as it 
> > > does. So,
> > > in order to fix this, driver would need quite a big reorganization. 
> > > Introducing
> > > shadow buffers would solve most of the issues, most likely also with 
> > > YUV scaling.
> > > Implementing RCQ would be then relatively simple and even improve 
> > > timings.
> > > Note that even DE3 has occasional issue with YUV scaler and also 
> > > read-modify-read
> > > access to some of its register can produce bogus value and thus corrupt 
> > > image
> > > until next commit.
> > 
> > Thanks, again I wasn't aware. All my testing has shown remarkable stability and there are some downstream users out-of-tree with good feedback, but would be happy to support an effort to improve this.
> 
> Let's land DE33 support first since it's long overdue and then I'm happy to discuss plans for moving forward.
> 
> Best regards,
> Jernej
> 
> > 
> > Regards,
> > 
> > Ryan
> > 
> 
> 
> 
> 
> 









[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux