Re: [Intel-gfx] [PATCH v2 00/10] Color Manager Implementation

Hans Verkuil <hverkuil@xxxxxxxxx> · Tue, 14 Jul 2015 11:35:30 +0200

On 07/14/15 11:11, Daniel Vetter wrote:
> On Tue, Jul 14, 2015 at 10:17:09AM +0200, Hans Verkuil wrote:
>> On 07/13/15 16:07, Daniel Vetter wrote:
>>> On Mon, Jul 13, 2015 at 12:11:08PM +0200, Hans Verkuil wrote:
>>>> On 07/13/2015 11:54 AM, Daniel Vetter wrote:
>>>>> On Mon, Jul 13, 2015 at 11:43:31AM +0200, Hans Verkuil wrote:
>>>>>> On 07/13/2015 11:18 AM, Daniel Vetter wrote:
>>>>>>> On Mon, Jul 13, 2015 at 10:29:32AM +0200, Hans Verkuil wrote:
>>>>>>>> On 06/15/2015 08:53 AM, Daniel Vetter wrote:
>>>>>>>>> On Tue, Jun 09, 2015 at 01:50:48PM +0100, Damien Lespiau wrote:
>>>>>>>>>> On Thu, Jun 04, 2015 at 07:12:31PM +0530, Kausal Malladi wrote:
>>>>>>>>>>> From: Kausal Malladi <Kausal.Malladi@xxxxxxxxx>
>>>>>>>>>>>
>>>>>>>>>>> This patch set adds color manager implementation in drm/i915 layer.
>>>>>>>>>>> Color Manager is an extension in i915 driver to support color 
>>>>>>>>>>> correction/enhancement. Various Intel platforms support several
>>>>>>>>>>> color correction capabilities. Color Manager provides abstraction
>>>>>>>>>>> of these properties and allows a user space UI agent to 
>>>>>>>>>>> correct/enhance the display.
>>>>>>>>>>
>>>>>>>>>> So I did a first rough pass on the API itself. The big question that
>>>>>>>>>> isn't solved at the moment is: do we want to try to do generic KMS
>>>>>>>>>> properties for pre-LUT + matrix + post-LUT or not. "Generic" has 3 levels:
>>>>>>>>>>
>>>>>>>>>>   1/ Generic for all KMS drivers
>>>>>>>>>>   2/ Generic for i915 supported platfoms
>>>>>>>>>>   3/ Specific to each platform
>>>>>>>>>>
>>>>>>>>>> At this point, I'm quite tempted to say we should give 1/ a shot. We
>>>>>>>>>> should be able to have pre-LUT + matrix + post-LUT on CRTC objects and
>>>>>>>>>> guarantee that, when the drivers expose such properties, user space can
>>>>>>>>>> at least give 8 bits LUT + 3x3 matrix + 8 bits LUT.
>>>>>>>>>>
>>>>>>>>>> It may be possible to use the "try" version of the atomic ioctl to
>>>>>>>>>> explore the space of possibilities from a generic user space to use
>>>>>>>>>> bigger LUTs as well. A HAL layer (which is already there in some but not
>>>>>>>>>> all OSes) would still be able to use those generic properties to load
>>>>>>>>>> "precision optimized" LUTs with some knowledge of the hardware.
>>>>>>>>>
>>>>>>>>> Yeah, imo 1/ should be doable. For the matrix we should be able to be
>>>>>>>>> fully generic with a 16.16 format. For gamma one option would be to have
>>>>>>>>
>>>>>>>> I know I am late replying, apologies for that.
>>>>>>>>
>>>>>>>> I've been working on CSC support for V4L2 as well (still work in progress)
>>>>>>>> and I would like to at least end up with the same low-level fixed point
>>>>>>>> format as DRM so we can share matrix/vector calculations.
>>>>>>>>
>>>>>>>> Based on my experiences I have concerns about the 16.16 format: the precision
>>>>>>>> is quite low which can be a problem when such values are used in matrix
>>>>>>>> multiplications.
>>>>>>>>
>>>>>>>> In addition, while the precision may be sufficient for 8 bit color component
>>>>>>>> values, I'm pretty sure it will be insufficient when dealing with 12 or 16 bit
>>>>>>>> color components.
>>>>>>>>
>>>>>>>> In earlier versions of my CSC code I used a 12.20 format, but in the latest I
>>>>>>>> switched to 32.32. This fits nicely in a u64 and it's easy to extract the
>>>>>>>> integer and fractional parts.
>>>>>>>>
>>>>>>>> If this is going to be a generic and future proof API, then my suggestion
>>>>>>>> would be to increase the precision of the underlying data type.
>>>>>>>
>>>>>>> We discussed this a bit more internally and figured it would be nice to have the same
>>>>>>> fixed point for both CSC matrix and LUT/gamma tables. Current consensus
>>>>>>> seems to be to go with 8.24 for both. Since LUTs are fairly big I think it
>>>>>>> makes sense if we try to be not too wasteful (while still future-proof
>>>>>>> ofc).
>>>>>>
>>>>>> The .24 should have enough precision, but I am worried about the 8: while
>>>>>> this works for 8 bit components, you can't use it to represent values
>>>>>>> 255, which might be needed (now or in the future) for 10, 12 or 16 bit
>>>>>> color components.
>>>>>>
>>>>>> It's why I ended up with 32.32: it's very generic so usable for other
>>>>>> things besides CSC.
>>>>>>
>>>>>> Note that 8.24 is really 7.24 + one sign bit. So 255 can't be represented
>>>>>> in this format.
>>>>>>
>>>>>> That said, all values I'm working with in my current code are small integers
>>>>>> (say between -4 and 4 worst case), so 8.24 would work. But I am not at all
>>>>>> confident that this is future proof. My gut feeling is that you need to be
>>>>>> able to represent at least the max component value + a sign bit + 7 decimals
>>>>>> precision. Which makes 17.24.
>>>>>
>>>>> The idea is to steal from GL and always normalize everything to [0.0,
>>>>> 1.0], irrespective of the source color format. We need that in drm since
>>>>> if you blend together planes with different formats it's completely
>>>>> undefined which one you should pick. 8 bits of precision for values out of
>>>>> range should be enough ;-)
>>>>
>>>> That doesn't really help much, using a [0-1] range just means that you need
>>>> more precision for the fraction since the integer precision is now added to
>>>> the fractional precision.
>>>>
>>>> So for 16-bit color components the 8.24 format will leave you with only 8 bits
>>>> precision if you scale each component to the [0-1] range. That's slightly more
>>>> than 2 decimals. I don't believe that is enough. If you do a gamma table lookup
>>>> and then feed the result to a CSC matrix you need more precision if you want
>>>> to get accurate results.
>>>
>>> Hm, why do we need 8 bits more precision than source data? At least in the
>>> intel hw I've seen the most bits we can stuff into the hw is 0.12 (again
>>> for rescaled range to 0.0-1.0). 24 bits means as-is we'll throw 12 bits
>>> away. What would you want to use these bits for?
>>
>> The intel hardware uses 12 bits today, but what about the next-gen? If you are
>> defining an API and data type just for the hardware the kernel supports today,
>> then 12 bits might be enough precision. If you want to be future proof then you
>> need to be prepared for more capable future hardware.
>>
>> So 0.12 will obviously not be enough if you want to support 16 bit color components
>> in the future.
>>
>> In addition, to fully support HW colorspace conversion (e.g. sRGB to Rec.709) where
>> lookup tables are used for implementing the transfer functions (normal and inverse),
>> then you need more precision then just the number of bits per component or you will
>> get quite large errors in the calculation.
>>
>> It all depends how a LUT is used: if the value from the LUT is the 'final' value,
>> then you don't need more precision than the number of bits of a color component. But
>> if it is used in other calculations (3x3 matrices, full/limited range scaling, etc),
>> then the LUT should provide more bits precision.
>>
>> Which seems to be the case with Intel hardware: 12 bits is 4 bits more than the 8 bits
>> per component it probably uses.
> 
> Intel hw supports a 12bpp pixel pipeline. They didnt add _any_ additional
> precision at all afaik. Which is why I wonder why we need it. I'm also not
> aware of any plans for pushing past 12bpp of data sent to the sink, but I
> honestly don't have much clue really.
> 
> I guess input is a different story, todays cmos already easily to 14bit
> with more to come I guess with all the noise about HDR. We probably need
> more headroom on v4l input side than we ever need on drm display side.
> Still 24bits is an awful lot of headroom, at least for the next few years.
> Or do you expect to hit that already soonish on v4l side?

I think 24 bits precision is enough, but that assumes that the integer part
will be between -128 and 127. And I am not so sure that that is a valid assumption.

It's true today, but what if you have a HW LUT that maps integer values and expects
16.0 or perhaps 12.4?

BTW, I am assuming that the proposed 8.24 format is a signed format: the CSC
3x3 matrices contain negative values, so any fixed point data type has to be signed.

I'm just wondering: is it really such a big deal to use a 32.32 format? Yes, the
amount of data doubles, but it's quite rare that you need to configure a LUT, right?

For a 12 bit LUT it's 16 kB vs 32 kB. Yes, it's more data, but the advantage is that
the data type is future proof (well, probably :-) ) and much more likely to be usable
in other subsystems.

>> I would guess that a LUT supporting 16 bit color components would need a precision
>> of 0.20 or so (assuming the resulting values are used in further calculations).
>>
>> High dynamic range video will be an important driving force towards higher bit depths
>> and accurate color handling, so you can expect to see this become much more important
>> in the coming years.
>>
>> And as I mentioned another consideration is that this fixed point data type might
>> be useful elsewhere in the kernel where you need to do some precision arithmetic.
>> So using a standard type that anyone can use with functions in lib/ to do basic
>> operations can be very useful indeed beyond just DRM and V4L2.
> 
> 0.20 would still comfortably fit into 8.24. And yeah worst-case (in 10
> years or so) we need to add a high-bpp variant if it really comes to it.

I think this is much closer than you think. I agree that you are not likely to see
this soon for consumer graphics cards, but for professional equipment and high-end
consumer electronics this is another story.

And if it is being done for input, then output will need it as well: after all,
what's the point of 16-bit color components if you can't display it? Whether Intel
will support it is another matter, but there are other vendors, you know... :-)

Regards,

	Hans

> But for the next few years (and that's the pace at which we completely
> rewrite gfx drivers anyway) I don't see anything going past .24. .16 was
> indeed a bit too little I think, which is why we decided to move the fixed
> point a bit.
> -Daniel
> 
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel