Sent from my iPad

> On Jan 30, 2019, at 5:41 AM, Nicolas Dufresne <nicolas@xxxxxxxxxxxx> wrote:
>
>> On Tuesday, 29 January 2019 at 16:44 +0900, Alexandre Courbot wrote:
>> On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
>> <paul.kocialkowski@xxxxxxxxxxx> wrote:
>>> Hi,
>>>
>>>> On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
>>>> Sent from my iPad
>>>>
>>>>> On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>>> On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
>>>>>> I forgot an important thing: the rkvdec and rk hevc decoders
>>>>>> request the cabac table, scaling list, picture parameter set and reference
>>>>>> picture stored in one or more DMA buffers. I am not talking about
>>>>>> parsed data; the decoder requests the raw data.
>>>>>>
>>>>>> For the pps and rps, it is possible to reuse the slice header, just let
>>>>>> the decoder know the offset from the bitstream buffer. I would suggest
>>>>>> adding three properties (with sps) for them. But I think we need a method to
>>>>>> mark an OUTPUT side buffer for that aux data.
>>>>>
>>>>> I'm quite confused about the hardware implementation then. From what
>>>>> you're saying, it seems that it takes the raw bitstream elements rather
>>>>> than parsed elements. Is it really a stateless implementation?
>>>>>
>>>>> The stateless implementation was designed with the idea that only the
>>>>> raw slice data should be passed in bitstream form to the decoder. For
>>>>> H.264, it seems that some decoders also need the slice header in raw
>>>>> bitstream form (because they take the full slice NAL unit), see the
>>>>> discussions in this thread:
>>>>> media: docs-rst: Document m2m stateless video decoder interface
>>>>
>>>> Stateless just means it won’t track the previous result, but I don’t
>>>> think you can define what data the hardware would need.
>>>> Even if you just build a dpb for the decoder, it is still stateless;
>>>> parsing less or more data from the bitstream doesn’t stop a decoder
>>>> from being a stateless decoder.
>>>
>>> Yes fair enough, the format in which the hardware decoder takes the
>>> bitstream parameters does not make it stateless or stateful per-se.
>>> It's just that stateless decoders should have no particular reason for
>>> parsing the bitstream on their own since the hardware can be designed
>>> with registers for each relevant bitstream element to configure the
>>> decoding pipeline. That's how GPU-based decoder implementations are
>>> implemented (VAAPI/VDPAU/NVDEC, etc).
>>>
>>> So the format we have agreed on so far for the stateless interface is
>>> to pass parsed elements via v4l2 control structures.
>>>
>>> If the hardware can only work by parsing the bitstream itself, I'm not
>>> sure what the best solution would be. Reconstructing the bitstream in
>>> the kernel is a pretty bad option, but so is parsing in the kernel or
>>> having the data both in parsed and raw forms. Do you see another
>>> possibility?
>>
>> Is reconstructing the bitstream so bad? The v4l2 controls provide a
>> generic interface to an encoded format which the driver needs to
>> convert into a sequence that the hardware can understand. Typically
>> this is done by populating hardware-specific structures. Can't we
>> consider that in this specific instance, the hardware-specific
>> structure just happens to be identical to the original bitstream
>> format?
>
> At maximum allowed bitrate for let's say HEVC (940MB/s iirc), yes, it

Luckily, most hardware won’t be able to process such a big buffer.
Generally speaking, the register holding the stream length in bytes is 24 bits wide.

> would be really really bad.
> In the GStreamer project we have discussed for
> a while (but have never done anything about) adding the ability through
> a bitmask to select which parts of the stream need to be parsed, as
> parsing itself was causing some overhead. Maybe a similar thing applies,
> though as per our new design, it's the fourcc that dictates the driver
> behaviour, so we'd need yet another fourcc for drivers that want the full
> bitstream (which seems odd if you have already parsed everything, I
> think this needs some clarification).
>
>>
>> I agree that this is not strictly optimal for that particular
>> hardware, but such is the cost of abstractions, and in this specific
>> case I don't believe the cost would be particularly high?
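
To put a number on the 24-bit stream length register mentioned above: a
field that wide can describe at most 2^24 - 1 = 16777215 bytes (just under
16 MiB) per bitstream buffer. Below is a minimal sketch of the check a
driver might do before programming such a field. Only the 24-bit width comes
from this thread; the macro and function names are hypothetical and do not
belong to any existing driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the stream length field is 24 bits wide and counts bytes. */
#define STREAM_LEN_BITS 24u
#define STREAM_LEN_MAX  ((UINT32_C(1) << STREAM_LEN_BITS) - 1u) /* 16777215 */

/* Hypothetical helper: true if a buffer of len bytes fits in the field. */
static bool stream_len_fits(size_t len)
{
    return len <= STREAM_LEN_MAX;
}

/* Hypothetical helper: value to write into the length field. */
static uint32_t stream_len_field(size_t len)
{
    /* Callers are expected to reject oversized buffers beforehand. */
    assert(stream_len_fits(len));
    return (uint32_t)len & STREAM_LEN_MAX;
}
```

Under that assumption, a single buffer handed to the hardware is bounded
well below a full second of stream at the bitrate quoted above.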