On Wed, Dec 4, 2019 at 6:16 PM Gerd Hoffmann <kraxel@xxxxxxxxxx> wrote:
>
> Hi,
>
> > 1. Focus on only decoder/encoder functionalities first.
> >
> > As Tomasz said earlier in this thread, it'd be too complicated to support camera
> > usage at the same time. So, I'd suggest making it just a generic mem-to-mem
> > video processing device protocol for now.
> > If we finally decide to support camera in this protocol, we can add it later.
>
> Agree.
>
> > 2. Only one feature bit can be specified for one device.
> >
> > I'd like to have a decoder device and an encoder device separately.
> > It'd be natural to assume this because a decoder and an encoder are provided as
> > different hardware.
>
> Hmm, modern GPUs support both encoding and decoding ...
>

Many SoC architectures have completely separate IP blocks for encoding and
decoding. Similarly, in GPUs those are usually completely separate parts of
the pipeline.

> I don't think we should bake that restriction into the specification.
> It probably makes sense to use one virtqueue per function though, that
> will simplify dispatching in both host and guest.
>

Wouldn't it make the handling easier if we had one virtio device per
function?

[snip]

> > > +\begin{lstlisting}
> > > +enum virtio_video_pixel_format {
> > > +        VIRTIO_VIDEO_PIX_FMT_UNDEFINED = 0,
> > > +
> > > +        VIRTIO_VIDEO_PIX_FMT_H264 = 0x0100,
> > > +        VIRTIO_VIDEO_PIX_FMT_NV12,
> > > +        VIRTIO_VIDEO_PIX_FMT_NV21,
> > > +        VIRTIO_VIDEO_PIX_FMT_I420,
> > > +        VIRTIO_VIDEO_PIX_FMT_I422,
> > > +        VIRTIO_VIDEO_PIX_FMT_XBGR,
> > > +};
> >
> > I'm wondering if we can use FOURCC instead, so that we can avoid reinventing a
> > mapping from formats to integers.
> > Also, I suppose the word "pixel formats" means only raw (decoded) formats.
> > But it can also be an encoded format like H.264. So, I guess "image format" or
> > "fourcc" is a better word choice.
>
> Use separate pixel_format (fourcc) and stream_format (H.264 etc.) enums?
>

I'd specifically avoid FourCC here, as it's very loosely defined and could
introduce confusion. A separate enum for "image formats", including both,
sounds good to me.
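Just to illustrate what I have in mind (the names below are placeholders
only, not a concrete proposal), a single "image format" enum could cover
both coded and raw formats, e.g. with separate numbering ranges so either
kind can grow without renumbering:

enum virtio_video_format {
        VIRTIO_VIDEO_FMT_UNDEFINED = 0,

        /* coded (bitstream) formats */
        VIRTIO_VIDEO_FMT_H264 = 0x0100,

        /* raw (pixel) formats */
        VIRTIO_VIDEO_FMT_NV12 = 0x0200,
        VIRTIO_VIDEO_FMT_I420,
};

Whether a given format is accepted on the input or the output side of a
function would then be described by the function's parameters (in_params /
out_params below).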
> > > +\begin{lstlisting}
> > > +struct virtio_video_function {
> > > +        struct virtio_video_desc desc;
> > > +        __le32 function_type; /* One of VIRTIO_VIDEO_FUNC_* types */
> > > +        __le32 function_id;
> > > +        struct virtio_video_params in_params;
> > > +        struct virtio_video_params out_params;
> > > +        __le32 num_caps;
> > > +        __u8 padding[4];
> > > +        /* Followed by struct virtio_video_capability video_caps[]; */
> > > +};
> > > +\end{lstlisting}
> >
> > If one device only has one functionality, virtio_video_function's fields will
> > no longer be needed, except for in_params and out_params. So, we'd be able to
> > remove virtio_video_function and have in_params and out_params in
> > virtio_video_capability instead.
>
> Same goes for per-function virtqueues (used virtqueue implies function).
>
> > > +\begin{lstlisting}
> > > +struct virtio_video_resource_detach_backing {
> > > +        struct virtio_video_ctrl_hdr hdr;
> > > +        __le32 resource_id;
> > > +        __u8 padding[4];
> > > +};
> > > +\end{lstlisting}
> > > +
> > > +\begin{description}
> > > +\item[\field{resource_id}] internal id of the resource.
> > > +\end{description}
> >
> > I suppose that it'd be better not to have the above series of T_RESOURCE
> > controls, at least until we reach a conclusion in the buffer-sharing device
> > thread. If we end up concluding that this type of control is the best way,
> > we'll be able to revisit here.
>
> Well.  For buffer management there are a bunch of options.
>
> (1) Simply stick the buffers (well, pointers to the buffer pages) into
>     the virtqueue.  This is the standard virtio way.
>
> (2) Create resources, then put the resource ids into the virtqueue.
>     virtio-gpu uses that model.  First, because virtio-gpu needs an id
>     to reference resources in the rendering command stream
>     (virtio-video doesn't need this).  Also because (some kinds of)
>     resources are around for a long time and the guest-physical ->
>     host-virtual mapping needs to be done only once that way (which
>     I think would be the case for virtio-video too because v4l2
>     re-uses buffers in round-robin fashion).  Drawback is that this
>     assumes shared memory between host and guest (which is the case
>     in typical use cases but is not mandated by the virtio spec).
>
> (3) Import external resources (from virtio-gpu for example).
>     Out of scope for now, will probably be added as an optional feature
>     later.
>
> I guess long-term we want to support either (1)+(3) or (2)+(3).

What this protocol has been proposing is a twist of (1), where there is a
"resource create" call that generates a local "resource ID" for the given
list of guest pages. I think that's a sane approach, given that the number
of pages needed to describe a buffer's worth of 4K video would be more than
3000. We don't want to send such long lists of pages for every frame,
especially since we normally recycle the buffers.

Best regards,
Tomasz
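P.S. For reference, the "more than 3000" figure assumes a 4K NV12 buffer and
4 KiB pages: 3840 x 2160 pixels x 1.5 bytes/pixel = 12,441,600 bytes, i.e.
about 3038 pages (more once plane alignment and padding are taken into
account). A "resource create" call could then look roughly like the sketch
below; the struct and field names here are made up purely for illustration,
loosely following what virtio-gpu does for its RESOURCE_ATTACH_BACKING
command:

struct virtio_video_mem_entry {
        __le64 addr;   /* guest-physical address of the region */
        __le32 length; /* length of the region in bytes */
        __u8 padding[4];
};

struct virtio_video_resource_create {
        struct virtio_video_ctrl_hdr hdr;
        __le32 resource_id; /* id used to refer to this buffer later */
        __le32 nr_entries;  /* number of mem entries that follow */
        /* Followed by struct virtio_video_mem_entry entries[]; */
};

Once a resource is created, per-frame queue requests would only need to
carry the 32-bit resource_id instead of the full page list.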