On Thu, Nov 7, 2019 at 10:09 PM Dmitry Sepp <dmitry.sepp@xxxxxxxxxxxxxxx> wrote:
>
> Hello Gerd,
>
> Thank you for your feedback.
>
> There is no relationship between those. As I mentioned earlier, we have also
> been working on a virtio video device at the same time. And there is no
> relationship between the two specs.
>
> I can point you to the differences I see:
>
> virtio-vdec:
> 1. Both the device and the driver submit requests to each other. For each
> request the response is sent as a separate request.

To be more precise, in vdec there are no responses. The guest sends
commands to the host using one virtqueue. The host signals asynchronous
events, which might not have the exact earlier guest request associated
with them. An example of such a special case could be H.264 framebuffer
reordering, where one might end up with a few decode requests not
resulting in any frames being output and then one decode request that
would trigger multiple accumulated frames to be returned.

> 2. No support for getting/setting video stream parameters. For example
> (decoder): output format (NV12, I420), so the driver cannot really select the
> output format after headers have been parsed.

Getting video stream parameters is there, but they are currently left
fully in control of the host video decoder. The ability to select between
multiple possible formats could be worth adding, though.

> 3. No support for getting plane requirements from the device (sg vs contig,
> size, stride alignment, plane count).

There is actually a bigger difference that results in that. Vdec
assumes host-allocated buffers coming from a different device, e.g.
virtio-gpu, and the host having the right knowledge to allocate the
buffers correctly. This is related to the fact that it's generally
difficult to convey all the allocation constraints in a generic manner.

> 4. In the vdec device Drain and Flush are not separate for each buffer queue.
> So seek and dynamic resolution change (adaptive playback) cannot be
> implemented because 'flush' can have different meaning for a resolution change
> and a seek.

That's not true. Drain and flush can be defined very precisely for a
stateful video decoder. For example, V4L2 defines drain as follows:

https://www.kernel.org/doc/html/latest/media/uapi/v4l/dev-decoder.html#drain

and it's modeled after that in vdec. There is no flush explicitly
defined in V4L2, but that corresponds to the behavior of STREAMOFF,
which drops all the buffers on a given queue. There is no practical use
for flushing just one queue in the case of a video decoder, so we decided
to simplify it down to a single flush that fully resets the decoder,
which is useful for instantaneous seek.

There is also existing infrastructure for dynamic resolution change,
using the stream information host request and drain flow, which is very
similar to how this is done in V4L2:

https://www.kernel.org/doc/html/latest/media/uapi/v4l/dev-decoder.html#dynamic-resolution-change

> 5. Decoder only: new devices will be needed to support encoder, processor or
> capture. Currently input is always a bitstream, output is always a video
> frame. No way to set input format (needed for encoder, see 2).

The rationale for this was that this is a point of contact with the
host and a possible attack surface, so having a protocol as specific as
possible makes the attack surface smaller and is easier to validate in
the device implementation.

>
> virtio-video:
> 1. Uses the 'driver requests - device responses' model.
> 2. Does not have the logical split of bitstreams and framebuffers, has only a
> generic buffer object.
> 3. Generic: can support any type of video device right away due to flexibility
> of stream configuration. Both input and output buffer queues can accept
> bitstream and frame buffers and run independently. (more controls need to be
> defined for e.g.
> camera)

To fully support real cameras, not just simple UVC webcams, some
mechanism to have multiple output and capture streams synchronized
together would be needed, because Android Camera HALv3 heavily relies
on multiple stream support. For example, a simple camera application
with ZSL (zero shutter lag) could set up the following streams:

1) YUV preview
2) RAW capture
3) RAW output
4) JPEG capture

1) and 2) would operate at camera frame rate, while 3) and 4) would be
given on demand whenever the user presses the shutter button. Presence
of 3) and 4) must not affect 1) and 2), i.e. the preview and raw
capture must continue at camera frame rate.

> 4. Supports seek, drain, dynamic resolution change using an API agnostic
> command set (no v4l2/vaapi or so on remoting).
> 5. Complex configuration (most likely cannot be simplified for such a complex
> device type).
>
> Best regards,
> Dmitry.
>
> On Thursday, 7 November 2019 10:56:57 CET Gerd Hoffmann wrote:
> > On Tue, Nov 05, 2019 at 08:19:19PM +0100, Dmitry Sepp wrote:
> > > [Resend after fixing an issue with the virtio-dev mailing list]
> > >
> > > This patch proposes a virtio specification for a new virtio video
> > > device.
> >
> > Hmm, quickly looking over this, it looks similar to the vdec draft
> > posted a few weeks ago, with other device variants added but not
> > fully specified yet.
> >
> > So, can you clarify the relationship between the two?
> >
> > thanks,
> > Gerd
> >
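
As a footnote to the H.264 reordering case in point 1 and the drain/flush
discussion in point 4, here is a toy Python sketch of why decode requests
and output frames do not map one to one. Everything in it is an assumption
made for illustration: the class name ToyReorderingDecoder, its methods and
the integer "display index" are hypothetical and are not part of the vdec
or virtio-video drafts, nor of a real H.264 decoder.

```python
class ToyReorderingDecoder:
    """Toy model only: frames are plain integers giving their display order."""

    def __init__(self):
        self.next_display = 0   # display index that must be output next
        self.held = set()       # decoded frames waiting for earlier ones

    def decode(self, display_index):
        """Handle one decode request; return the frames that become ready."""
        self.held.add(display_index)
        ready = []
        # Release every frame that is now contiguous in display order.
        while self.next_display in self.held:
            self.held.remove(self.next_display)
            ready.append(self.next_display)
            self.next_display += 1
        return ready

    def drain(self):
        """Drain: output all held frames in display order, dropping nothing."""
        ready = sorted(self.held)
        self.held.clear()
        if ready:
            self.next_display = ready[-1] + 1
        return ready

    def flush(self):
        """Flush: drop all held frames and reset the state, e.g. for a seek."""
        self.held.clear()
        self.next_display = 0


dec = ToyReorderingDecoder()
# Frames submitted in decode order, identified by their display index.
outputs = [dec.decode(i) for i in [0, 4, 2, 1, 3]]
print(outputs)
```

In this run the second and third requests return no frames, while the
fourth and fifth each release two accumulated frames, which is the kind of
case where an output event cannot be tied to exactly one earlier request.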