On Mon, Oct 19, 2020 at 11:08:27PM -0400, Vitaly Prosyak wrote:
> 
> On 2020-10-19 3:49 a.m., Pekka Paalanen wrote:
> > On Fri, 16 Oct 2020 16:50:16 +0300
> > Ville Syrjälä<ville.syrjala@xxxxxxxxxxxxxxx> wrote:
> > 
> >> On Mon, Oct 12, 2020 at 10:11:01AM +0300, Pekka Paalanen wrote:
> >>> On Fri, 9 Oct 2020 17:20:18 +0300
> >>> Ville Syrjälä<ville.syrjala@xxxxxxxxxxxxxxx> wrote:
<snip>
> >> There is a slight snag on some Intel platforms that the gamma LUT
> >> is sitting after the CSC unit, and currently we use the CSC for
> >> the range compression.
> 
> Thanks a lot for letting us to know about this!
> AMD display pipe has always at the end CSC matrix where we apply
> appropriate range conversion if necessary.
> 
> >> 
> >> On glk in particular I *think* we currently just do the wrong
> >> thing and do the range compression before gamma. The same probably
> >> applies to hsw+ when both gamma and degamma are used at the same
> >> time. But that is clearly buggy, and we should fix it to either:
> >> a) return an error, which isn't super awesome since then you
> >>    can't do gamma+limited range at the same time on glk, nor
> >>    gamma+degamma+limited range on hsw+.
> >> b) for the glk case we could use the hw degamma LUT for the
> >>    gamma, which isn't great because the hw gamma and degamma
> >>    LUTs are quite different beasts, and so the hw degamma LUT
> >>    might not be able to do exactly what we need.
> 
> Do you mean that hw de-gamma LUT build on ROM ( it is not
> programmable, just select the proper bit)?

No. The hw degamma LUT is a 1x33 linearly interpolated
non-decreasing curve. So it can't do directcolor type stuff, and
each RGB channel must have the same gamma. The hw gamma LUT on the
other hand can operate in multiple different modes, from which we
currently choose the 3x1024 non-interpolated mode. Which can do
all those things the degamma LUT can't do.
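
To make the difference between the two LUT types a bit more concrete,
here is a rough sketch in Python (not driver code). All function names,
the 2.2 exponent, and the normalized [0, 1] value range are assumptions
picked for illustration; the real hw uses fixed point entries and its
own register layout.

```python
def build_lut(curve, entries=33):
    """Sample curve(x) at evenly spaced points over [0, 1]."""
    return [curve(i / (entries - 1)) for i in range(entries)]


def is_non_decreasing(lut):
    """True if every entry is >= the one before it."""
    return all(a <= b for a, b in zip(lut, lut[1:]))


def eval_interpolated(lut, x):
    """Evaluate a 1D LUT at x in [0, 1], interpolating linearly
    between the two neighbouring entries."""
    x = min(max(x, 0.0), 1.0)
    pos = x * (len(lut) - 1)
    i = min(int(pos), len(lut) - 2)  # keep i + 1 inside the table
    frac = pos - i
    return lut[i] + (lut[i + 1] - lut[i]) * frac


def apply_shared(lut, rgb):
    """1x33 style: one curve shared by all three channels."""
    return tuple(eval_interpolated(lut, c) for c in rgb)


def lookup_direct(luts, rgb_codes):
    """3x1024 style: one table per channel, indexed directly by the
    10-bit code value, no interpolation."""
    return tuple(lut[code] for lut, code in zip(luts, rgb_codes))


# A hypothetical degamma curve; it has to be non-decreasing to fit
# the interpolated 33-entry form.
degamma = build_lut(lambda x: x ** 2.2)
assert is_non_decreasing(degamma)
linear_rgb = apply_shared(degamma, (0.25, 0.5, 0.75))
```

With lookup_direct() each channel can carry an arbitrary table of its
own (including a decreasing one), which is the part the shared
interpolated curve can't express.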
> 
> >>    On hsw+ we do
> >>    use this trick already to get the gamma+limited range right,
> >>    but on these platforms the hw gamma and degamma LUTs have
> >>    identical capabilities.
> >> c) do the range compression with the hw gamma LUT instead, which
> >>    of course means we have to combine the user gamma and range
> >>    compression into the same gamma LUT.
> 
> Nice w/a and in amdgpu we are using also curve concatenations into
> re gamma LUT.
> 
> The number of concatenations could be as many as need it and we may
> take advantage of this in user mode. Does this sounds preliminarily
> good?
> 
> Wouldn't the following sentence be interesting for you if the user
> mode generates 1D LUT points using X axis exponential distribution to
> avoid unnecessary interpolation in kernel? It may be especially
> important if curve concatenation is expected?

Yeah, I think we want a new uapi for gamma stuff that will allow
userspace to properly calculate things up front for different kinds
of hw implementations, without the kernel having to
interpolate/decimate. We've had some discussions/proposals on the
list.

> 
> >> 
> >> So I think c) is what it should be. Would just need to find the time
> >> to implement it, and figure out how to not totally mess up our
> >> driver's hw state checker. Hmm, except this won't help at all
> >> with YCbCr output since we need to apply gamma before the
> >> RGB->YCbCr conversion (which uses the same CSC again). Argh.
> >> So YCbCr output would still need option b).
> >> 
> >> Thankfully icl+ fixed all this by adding a dedicated output CSC
> >> unit which sits after the gamma LUT in the pipeline. And pre-hsw
> >> is almost fine as well since the hw has a dedicated fixed function
> >> thing for the range compression. So the only snag on pre-hsw
> >> is the YCbCr+degamma+gamma case.
> 
> Where is the display engine scaler is located on Intel platforms?
> AMD old ASIC's have a display scaler after display color pipeline, so
> the whole color processing can be a bit mess up unless integer scaling
> is in use.
> 
> The new ASIC's ( ~5 years already) have scaler before color pipeline.

We have a somewhat similar situation. On older hw the scaler tap
point is at the end of the pipe, so between the gamma LUT and
dithering.

On icl+ I think we have two tap points; one between degamma LUT and
the first pipe CSC, and a second one between the output CSC and
dithering. The spec calls these non-linear and linear tap points.
The scaler also gained another linear vs. non-linear control knob
which affects the precision at which it can operate in some form.
There's also some other interaction between this and another knob
("HDR" mode) which controls the precision of blending in the pipe.
I haven't yet thought how we should configure all this to the best
effect. For the moment we leave these scaler settings to their
defaults, which means using the non-linear tap point and non-linear
precision setting. The blending precision we adjust dynamically
depending on which planes are enabled. Only a subset of the planes
(so called HDR planes) can be enabled when using the high precision
blending mode.

On icl+ plane scaling also has the two different tap points, but
this time I think it just depends on the type of plane used; HDR
planes have a linear tap point just before blending, SDR planes
have a non-linear tap point right after the pixels enter the
plane's pipeline. Older hw again just had the non-linear tap point.

-- 
Ville Syrjälä
Intel
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel