On Wed, Jan 30, 2019 at 12:35:41PM +0900, Tomasz Figa wrote:
> On Wed, Jan 30, 2019 at 11:29 AM Alexandre Courbot
> <acourbot@xxxxxxxxxxxx> wrote:
> >
> > On Wed, Jan 30, 2019 at 6:41 AM Nicolas Dufresne <nicolas@xxxxxxxxxxxx> wrote:
> > >
> > > On Tuesday, 29 January 2019 at 16:44 +0900, Alexandre Courbot wrote:
> > > > On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
> > > > <paul.kocialkowski@xxxxxxxxxxx> wrote:
> > > > > Hi,
> > > > >
> > > > > On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
> > > > > > Sent from my iPad
> > > > > >
> > > > > > > On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > > On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> > > > > > > > I forgot an important thing: for the rkvdec and rk hevc decoder, it would
> > > > > > > > request the cabac table, scaling list, picture parameter set and reference
> > > > > > > > picture stored in one or several DMA buffers. I am not talking about
> > > > > > > > the parsed data; the decoder would request raw data.
> > > > > > > >
> > > > > > > > For the pps and rps, it is possible to reuse the slice header, just let
> > > > > > > > the decoder know the offset from the bitstream buffer. I would suggest to
> > > > > > > > add three properties (with sps) for them. But I think we need a method to
> > > > > > > > mark an OUTPUT side buffer for those aux data.
> > > > > > >
> > > > > > > I'm quite confused about the hardware implementation then. From what
> > > > > > > you're saying, it seems that it takes the raw bitstream elements rather
> > > > > > > than parsed elements. Is it really a stateless implementation?
> > > > > > >
> > > > > > > The stateless implementation was designed with the idea that only the
> > > > > > > raw slice data should be passed in bitstream form to the decoder. For
> > > > > > > H.264, it seems that some decoders also need the slice header in raw
> > > > > > > bitstream form (because they take the full slice NAL unit), see the
> > > > > > > discussions in this thread:
> > > > > > > media: docs-rst: Document m2m stateless video decoder interface
> > > > > >
> > > > > > Stateless just means it won’t track the previous result, but I don’t
> > > > > > think you can define what data the hardware would need. Even if you
> > > > > > just build a dpb for the decoder, it is still stateless, and parsing
> > > > > > less or more data from the bitstream doesn’t stop a decoder from being a
> > > > > > stateless decoder.
> > > > >
> > > > > Yes fair enough, the format in which the hardware decoder takes the
> > > > > bitstream parameters does not make it stateless or stateful per-se.
> > > > > It's just that stateless decoders should have no particular reason for
> > > > > parsing the bitstream on their own since the hardware can be designed
> > > > > with registers for each relevant bitstream element to configure the
> > > > > decoding pipeline. That's how GPU-based decoder implementations are
> > > > > implemented (VAAPI/VDPAU/NVDEC, etc).
> > > > >
> > > > > So the format we have agreed on so far for the stateless interface is
> > > > > to pass parsed elements via v4l2 control structures.
> > > > >
> > > > > If the hardware can only work by parsing the bitstream itself, I'm not
> > > > > sure what the best solution would be. Reconstructing the bitstream in
> > > > > the kernel is a pretty bad option, but so is parsing in the kernel or
> > > > > having the data both in parsed and raw forms. Do you see another
> > > > > possibility?
> > > >
> > > > Is reconstructing the bitstream so bad? The v4l2 controls provide a
> > > > generic interface to an encoded format which the driver needs to
> > > > convert into a sequence that the hardware can understand. Typically
> > > > this is done by populating hardware-specific structures. Can't we
> > > > consider that in this specific instance, the hardware-specific
> > > > structure just happens to be identical to the original bitstream
> > > > format?
> > >
> > > At maximum allowed bitrate for let's say HEVC (940MB/s iirc), yes, it
> > > would be really really bad. In GStreamer project we have discussed for
> > > a while (but have never done anything about) adding the ability through
> > > a bitmask to select which parts of the stream need to be parsed, as
> > > parsing itself was causing some overhead. Maybe a similar thing applies,
> > > though as per our new design, it's the fourcc that dictates the driver
> > > behaviour, so we'd need yet another fourcc for drivers that want the full
> > > bitstream (which seems odd if you have already parsed everything, I
> > > think this needs some clarification).
> >
> > Note that I am not proposing to rebuild the *entire* bitstream
> > in-kernel. What I am saying is that if the hardware interprets some
> > structures (like SPS/PPS) in their raw format, this raw format could
> > be reconstructed from the structures passed by userspace at negligible
> > cost. Such manipulation would only happen on a small amount of data.
> >
> > Exposing finer-grained driver requirements through a bitmask may
> > deserve more exploring. Maybe we could end with a spectrum of
> > capabilities that would allow us to cover the range from fully
> > stateless to fully stateful IPs more smoothly. Right now we have two
> > specifications that only consider the extremes of that range.
>
> I gave it a bit more thought and if we combine what Nicolas suggested
> about the bitmask control with the userspace providing the full
> bitstream in the OUTPUT buffers, split into some logical units and
> "tagged" with their type (e.g. SPS, PPS, slice, etc.), we could
> potentially get an interface that would work for any kind of decoder I
> can think of, actually eliminating the boundary between stateful and
> stateless decoders.
>
> For example, a fully stateful decoder would have the bitmask control
> set to 0 and accept data from all the OUTPUT buffers as they come. A
> decoder that doesn't do any parsing on its own would have all the
> valid bits in the bitmask set and ignore the data in OUTPUT buffers
> tagged as any kind of metadata. And then, we could have any cases in
> between, including stateful decoders which just can't parse the stream
> on their own, but still manage anything else themselves, or stateless
> ones which can parse parts of the stream, like the rk3399 vdec can
> parse the H.264 slice headers on its own.
>
> That could potentially let us completely eliminate the distinction
> between the stateful and stateless interfaces and just have one that
> covers both.
>
> Thoughts?

If we have to provide the whole bitstream in the buffers, then it
entirely breaks the sole software stack we have running and working
currently, for a use case and a driver that hasn't seen a single line
of code.

Seriously, this is a *private* API that we did that way so that we can
change it and only make it public. Why not do just that?

Maxime

-- 
Maxime Ripard, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
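
For readers following the quoted proposal, here is a minimal sketch of how
the bitmask plus tagged OUTPUT units might behave. Every name in it
(unit_type, UNIT_BIT, PARSED_ALL, driver_reads_raw_unit) is hypothetical
and made up for illustration; none of this is an existing V4L2 control or
structure, and the thread does not settle on any such layout. The
assumption is that a set bit means "userspace supplies this element in
parsed form, so the driver skips the raw unit", and that slice data is
always read from the buffer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tags for the logical units placed in OUTPUT buffers. */
enum unit_type {
    UNIT_SPS = 0,
    UNIT_PPS,
    UNIT_SLICE_HEADER,
    UNIT_SLICE_DATA
};

#define UNIT_BIT(t) (1u << (t))

/* Units that count as metadata; slice data is payload, not metadata. */
#define METADATA_UNITS \
    (UNIT_BIT(UNIT_SPS) | UNIT_BIT(UNIT_PPS) | UNIT_BIT(UNIT_SLICE_HEADER))

/* Mask for a decoder that does no parsing on its own. */
#define PARSED_ALL METADATA_UNITS

/*
 * Returns true if the driver should consume the raw unit of the given
 * type from the OUTPUT buffer, false if it should ignore it because the
 * parsed form is expected from userspace instead.
 */
static bool driver_reads_raw_unit(uint32_t parsed_mask, enum unit_type type)
{
    /* Non-metadata units (slice data) are always taken from the buffer. */
    if (!(UNIT_BIT(type) & METADATA_UNITS))
        return true;

    /* Metadata is read raw only when its bit is clear in the mask. */
    return !(parsed_mask & UNIT_BIT(type));
}
```

Under these assumptions, a mask of 0 describes the fully stateful case
(every unit is read raw), PARSED_ALL describes a decoder that ignores all
metadata units, and UNIT_BIT(UNIT_SPS) | UNIT_BIT(UNIT_PPS) would describe
the in-between case mentioned for the rk3399 vdec, which parses H.264 slice
headers itself.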