Re: [PATCH v4 2/2] media: cedrus: Add H264 decoding support

Jernej Škrabec <jernej.skrabec@xxxxxxxxx> · Wed, 06 Mar 2019 19:17:49 +0100

Dne sreda, 06. marec 2019 ob 11:57:08 CET je Maxime Ripard napisal(a):
> Hi,
> 
> On Tue, Mar 05, 2019 at 06:05:08PM +0100, Jernej Škrabec wrote:
> > Dne torek, 05. marec 2019 ob 11:17:32 CET je Maxime Ripard napisal(a):
> > > Hi Jernej,
> > > 
> > > On Wed, Feb 20, 2019 at 06:50:54PM +0100, Jernej Škrabec wrote:
> > > > I really wanted to do another review on previous series but got
> > > > distracted
> > > > by analyzing one particulary troublesome H264 sample. It still doesn't
> > > > work correctly, so I would ask you if you can test it with your stack
> > > > (it
> > > > might be userspace issue):
> > > > 
> > > > http://jernej.libreelec.tv/videos/problematic/test.mkv
> > > > 
> > > > Please take a look at my comments below.
> > > 
> > > I'd really prefer to focus on getting this merged at this point, and
> > > then fixing odd videos and / or setups we can find later
> > > on. Especially when new stacks are going to be developped on top of
> > > this, I'm sure we're going to have plenty of bugs to address :)
> > 
> > I forgot to mention, you can add:
> > Reviewed-by: Jernej Skrabec <jernej.skrabec@xxxxxxxx>
> > 
> > once you fix issues.
> 
> Great, thanks :)
> 
> > > > > +	for (i = 0; i < ARRAY_SIZE(pred_weight->weight_factors); i++) {
> > > > > +		const struct v4l2_h264_weight_factors *factors =
> > > > > +			&pred_weight->weight_factors[i];
> > > > > +
> > > > > +		for (j = 0; j < ARRAY_SIZE(factors->luma_weight); j++)
> > 
> > {
> > 
> > > > > +			u32 val;
> > > > > +
> > > > > +			val = ((factors->luma_offset[j] & 0x1ff) <<
> > 
> > 16)
> > 
> > > > > +				(factors->luma_weight[j] & 
0x1ff);
> > > > > +			cedrus_write(dev, VE_AVC_SRAM_PORT_DATA,
> > > > 
> > > > val);
> > > > 
> > > > You should cast offset varible to wider type. Currently some videos
> > > > which
> > > > use prediction weight table don't work for me, unless offset is casted
> > > > to
> > > > u32 first. Shifting 8 bit variable for 16 places gives you 0 every
> > > > time.
> > > 
> > > I'll do it.
> > > 
> > > > Luma offset and weight are defined as s8, so having wider mask doesn't
> > > > really make sense. However, I think weight should be s16 anyway,
> > > > because
> > > > standard says that it's value could be 2^denominator for default value
> > > > or
> > > > in range -128..127. Worst case would be 2^7 = 128 and -128. To cover
> > > > both
> > > > values you need at least 9 bits.
> > > 
> > > But if I understood the spec right, in that case you would just have
> > > the denominator set, and not the offset, while the offset is used if
> > > you don't use the default formula (and therefore remains in the -128
> > > 127 range which is covered by the s8), right?
> > 
> > Yeah, default offset is 0 and s8 is sufficient for that. I'm talking about
> > weight. Default weight is "1 << denominator", which might be 1 << 7 or
> > 128.
> > 
> > We could also add a flag, which would signal default table. In that case
> > we
> > could just set a bit to tell VPU to use default values. Even if some VPUs
> > need default table to be set explicitly, it's very easy to calculate
> > values as mentioned in previous paragraph.
> 
> Yeah, sorry, I meant weight. Would that make any difference? Can we
> have situations where both the denominator and the weight would be
> set, with the weight set to 128?

Yes, that's the case when default weight is used and (log2) denominator is 7. 
Weight is then "1 << denominator".

> 
> I've checked in the libva and ffmpeg, and libva uses a short, while
> ffmpeg uses an int, both for the weight and offset. For consistency I
> guess we could change both to shorts just like libva?
> 
> What do you think?

Yes, that would work for me. Maybe someone else has opinion about this? But I 
think there is a reason why libva uses shorts.

Best regards,
Jernej