On Tue, Oct 20, 2020 at 09:46:30PM -0400, Vitaly Prosyak wrote:
>
> On 2020-10-20 11:04 a.m., Ville Syrjälä wrote:
> > On Mon, Oct 19, 2020 at 11:08:27PM -0400, Vitaly Prosyak wrote:
> >> On 2020-10-19 3:49 a.m., Pekka Paalanen wrote:
> >>> On Fri, 16 Oct 2020 16:50:16 +0300
> >>> Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
> >>>
> >>>> On Mon, Oct 12, 2020 at 10:11:01AM +0300, Pekka Paalanen wrote:
> >>>>> On Fri, 9 Oct 2020 17:20:18 +0300
> >>>>> Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
> > <snip>
> >>>> There is a slight snag on some Intel platforms that the gamma LUT
> >>>> is sitting after the CSC unit, and currently we use the CSC for
> >>>> the range compression.
> >> Thanks a lot for letting us know about this!
> >> The AMD display pipe always has a CSC matrix at the end, where we
> >> apply the appropriate range conversion if necessary.
> >>
> >>>> On glk in particular I *think* we currently just do the wrong
> >>>> thing and do the range compression before gamma. The same probably
> >>>> applies to hsw+ when both gamma and degamma are used at the same
> >>>> time. But that is clearly buggy, and we should fix it to either:
> >>>> a) return an error, which isn't super awesome since then you
> >>>>    can't do gamma+limited range at the same time on glk, nor
> >>>>    gamma+degamma+limited range on hsw+.
> >>>> b) for the glk case we could use the hw degamma LUT for the
> >>>>    gamma, which isn't great because the hw gamma and degamma
> >>>>    LUTs are quite different beasts, and so the hw degamma LUT
> >>>>    might not be able to do exactly what we need.
> >> Do you mean that the hw de-gamma LUT is built in ROM (it is not
> >> programmable, you just select the proper bit)?
> > No. The hw degamma LUT is a 1x33 linearly interpolated
> > non-decreasing curve. So can't do directcolor type stuff,
> > and each RGB channel must have the same gamma.
> >
> > The hw gamma LUT on the other hand can operate in multiple
> > different modes, from which we currently choose the
> > 3x1024 non-interpolated mode. Which can do all those
> > things the degamma LUT can't do.
> >
> >>>>    On hsw+ we do
> >>>>    use this trick already to get the gamma+limited range right,
> >>>>    but on these platforms the hw gamma and degamma LUTs have
> >>>>    identical capabilities.
> >>>> c) do the range compression with the hw gamma LUT instead, which
> >>>>    of course means we have to combine the user gamma and range
> >>>>    compression into the same gamma LUT.
> >> Nice w/a, and in amdgpu we also use curve concatenation into the
> >> regamma LUT.
> >>
> >> The number of concatenations could be as many as needed, and we may
> >> take advantage of this in user mode. Does this sound preliminarily good?
> >>
> >> Wouldn't it be interesting for you if user mode generated the 1D LUT
> >> points using an exponential distribution along the X axis, to avoid
> >> unnecessary interpolation in the kernel? It may be especially
> >> important if curve concatenation is expected.
> > Yeah, I think we want a new uapi for gamma stuff that will allow
> > userspace to properly calculate things up front for different kinds
> > of hw implementations, without the kernel having to interpolate/decimate.
> > We've had some discussions/proposals on the list.
> >
> >>>> So I think c) is what it should be. Would just need to find the time
> >>>> to implement it, and figure out how to not totally mess up our
> >>>> driver's hw state checker. Hmm, except this won't help at all
> >>>> with YCbCr output since we need to apply gamma before the
> >>>> RGB->YCbCr conversion (which uses the same CSC again). Argh.
> >>>> So YCbCr output would still need option b).
> >>>>
> >>>> Thankfully icl+ fixed all this by adding a dedicated output CSC
> >>>> unit which sits after the gamma LUT in the pipeline.
> >>>> And pre-hsw is almost fine as well since the hw has a dedicated
> >>>> fixed function thing for the range compression. So the only snag
> >>>> on pre-hsw is the YCbCr+degamma+gamma case.
> >> Where is the display engine scaler located on Intel platforms?
> >> Old AMD ASICs have a display scaler after the display color pipeline,
> >> so the whole color processing can get a bit messed up unless integer
> >> scaling is in use.
> >>
> >> The new ASICs (~5 years already) have the scaler before the color
> >> pipeline.
> > We have a somewhat similar situation.
> >
> > On older hw the scaler tap point is at the end of the pipe, so
> > between the gamma LUT and dithering.
> >
> > On icl+ I think we have two tap points; one between degamma
> > LUT and the first pipe CSC, and a second one between the output
> > CSC and dithering. The spec calls these non-linear and linear tap
> > points. The scaler also gained another linear vs. non-linear
> > control knob which affects the precision at which it can operate
> > in some form. There's also some other interaction between this and
> > another knob ("HDR" mode) which controls the precision of blending
> > in the pipe. I haven't yet thought how we should configure all this
> > to the best effect. For the moment we leave these scaler settings
> > to their defaults, which means using the non-linear tap point and
> > non-linear precision setting. The blending precision we adjust
> > dynamically depending on which planes are enabled. Only a subset
> > of the planes (so called HDR planes) can be enabled when using the
> > high precision blending mode.
> >
> > On icl+ plane scaling also has the two different tap points, but
> > this time I think it just depends on the type of plane used;
> > HDR planes have a linear tap point just before blending, SDR
> > planes have a non-linear tap point right after the pixels enter
> > the plane's pipeline. Older hw again just had the non-linear
> > tap point.
>
> Thanks for the clarification Ville!
>
> I am not sure if I understood the tap points correctly.
>
> Do you mean that you have 2 full scalers and each one can do
> horizontal and vertical scaling?
>
> The first scaler does scaling in linear space and the second in
> non-linear. Is that correct?

There are two scalers per pipe, each will do the full horz+vert
scaling, and each one can be assigned to either:
- any HDR plane linear tap point to scale the plane
- any SDR plane non-linear tap point to scale the plane
- pipe linear tap point to scale the whole crtc output
- pipe non-linear tap point to scale the whole crtc output

I don't think you're supposed to assign scalers to both of
the pipe tap points simultaneously. The registers might allow
it though, so could be an interesting experiment :P

> I just found a thread from Pekka:
> https://lists.freedesktop.org/archives/wayland-devel/2020-October/041637.html
>
> regarding integer scaling and other related stuff.
>
> The AMD display engine has always had 1 scaler; we concatenate two or
> more scaling transforms into one if necessary.
>
> Old ASICs do scaling in non-linear space, new ASICs in linear space
> since the scaler precision is half float.
>
> All these questions become important for hardware composition, and if
> the differences are too big (not sure about this) it can't be abstracted.
>
> As one approach, can we think about a shared object in user mode for
> each vendor (this approach was used in Android for hardware
> composition), where this small component can do the LUTs, scaler
> coefficients content and other incompatible stuff?

The idea has come up before. Getting any kind of acceptance for such a
thing across the various userspace components would probably require
a full time lobbyist.

I think various forms of gamma and CSC should be possible to abstract
in a somewhat reasonable way. For scaling we're now moving ahead with
the enum prop to specify the filter.
If there was a real need we could even try to abstract some kind of
filter coefficients uapi as well. I suspect most things would have some
kind of polyphase FIR filter.

-- 
Ville Syrjälä
Intel
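
For illustration, option c) from the thread (doing the range compression
with the gamma LUT, so the user gamma and the full->limited range
compression end up in the same LUT) could be sketched roughly as below.
This is not i915 code: the function name is hypothetical, the LUT entries
are assumed to be normalised floats in 0.0..1.0, and the limited RGB range
is assumed to be the usual 16..235 at 8 bits (scaled up for deeper
formats). Real hardware would need the result packed into its own register
format.

```python
def fold_range_compression(user_gamma, bits=8):
    """Combine a user gamma curve with full->limited range compression.

    user_gamma: list of normalised output values (0.0..1.0), one per
                LUT entry (hypothetical representation, an assumption).
    bits:       output bit depth; 16..235 at 8 bits is scaled by shifting.
    Returns a list of integer code values in limited range.
    """
    lo = 16 << (bits - 8)    # limited-range black level
    hi = 235 << (bits - 8)   # limited-range white level
    combined = []
    for v in user_gamma:
        v = min(max(v, 0.0), 1.0)                   # clamp the user curve
        combined.append(round(lo + v * (hi - lo)))  # compress into lo..hi
    return combined


# Example: a 1024-entry gamma 1/2.2 curve folded into one limited-range LUT.
user_gamma = [(i / 1023) ** (1 / 2.2) for i in range(1024)]
lut = fold_range_compression(user_gamma)
```

Because the compression is applied to the output of the user curve, a
single LUT lookup gives gamma followed by range compression, which is the
ordering the CSC-before-gamma platforms cannot get from the CSC alone.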