I'm also CC'ing the linux-media@xxxxxxxxxxxxxxx mailing list for these discussions, I'm sure there are folks there who are interested in codec and camera virtualization. On Sat, May 06, 2023 at 11:12:29AM +0300, Laurent Pinchart via libcamera-devel wrote: > On Fri, May 05, 2023 at 04:55:33PM +0100, Alex Bennée via libcamera-devel wrote: > > Kieran Bingham writes: > > > Quoting Alexander Gordeev (2023-05-05 10:57:29) > > >> On 03.05.23 17:53, Cornelia Huck wrote: > > >> > On Wed, May 03 2023, Alex Bennée <alex.bennee@xxxxxxxxxx> wrote: > > >> >> Cornelia Huck <cohuck@xxxxxxxxxx> writes: > > >> >>> On Fri, Apr 28 2023, Alexander Gordeev <alexander.gordeev@xxxxxxxxxxxxxxx> wrote: > > >> >>>> On 27.04.23 15:16, Alexandre Courbot wrote: > > >> >>>>> But in any case, that's irrelevant to the guest-host interface, and I > > >> >>>>> think a big part of the disagreement stems from the misconception that > > >> >>>>> V4L2 absolutely needs to be used on either the guest or the host, > > >> >>>>> which is absolutely not the case. > > >> >>>> > > >> >>>> I understand this, of course. I'm arguing, that it is harder to > > >> >>>> implement it, get it straight and then maintain it over years. Also it > > >> >>>> brings limitations, that sometimes can be workarounded in the virtio > > >> >>>> spec, but this always comes at a cost of decreased readability and > > >> >>>> increased complexity. Overall it looks clearly as a downgrade compared > > >> >>>> to virtio-video for our use-case. And I believe it would be the same for > > >> >>>> every developer, that has to actually implement the spec, not just do > > >> >>>> the pass through. So if we think of V4L2 UAPI pass through as a > > >> >>>> compatibility device (which I believe it is), then it is fine to have > > >> >>>> both and keep improving the virtio-video, including taking the best > > >> >>>> ideas from the V4L2 and overall using it as a reference to make writing > > >> >>>> the driver simpler. > > >> >>> > > >> >>> Let me jump in here and ask another question: > > >> >>> > > >> >>> Imagine that, some years in the future, somebody wants to add a virtio > > >> >>> device for handling video encoding/decoding to their hypervisor. > > >> >>> > > >> >>> Option 1: There are different devices to chose from. How is the person > > >> >>> implementing this supposed to pick a device? They might have a narrow > > >> >>> use case, where it is clear which of the devices is the one that needs to > > >> >>> be supported; but they also might have multiple, diverse use cases, and > > >> >>> end up needing to implement all of the devices. > > >> >>> > > >> >>> Option 2: There is one device with various optional features. The person > > >> >>> implementing this can start off with a certain subset of features > > >> >>> depending on their expected use cases, and add to it later, if needed; > > >> >>> but the upfront complexity might be too high for specialized use cases. > > >> >>> > > >> >>> Leaving concrete references to V4L2 out of the picture, we're currently > > >> >>> trying to decide whether our future will be more like Option 1 or Option > > >> >>> 2, with their respective trade-offs. > > >> >>> > > >> >>> I'm slightly biased towards Option 2; does it look feasible at all, or > > >> >>> am I missing something essential here? (I had the impression that some > > >> >>> previous confusion had been cleared up; apologies in advance if I'm > > >> >>> misrepresenting things.) > > >> >>> > > >> >>> I'd really love to see some kind of consensus for 1.3, if at all > > >> >>> possible :) > > >> >> > > >> >> I think feature discovery and extensibility is a key part of the VirtIO > > >> >> paradigm which is why I find the virtio-v4l approach limiting. By > > >> >> pegging the device to a Linux API we effectively limit the growth of the > > >> >> device specification to as fast as the Linux API changes. I'm not fully > > >> >> immersed in v4l but I don't think it is seeing any additional features > > >> >> developed for it and its limitations for camera are one of the reasons > > >> >> stuff is being pushed to userspace in solutions like libcamera: > > >> >> > > >> >> How is libcamera different from V4L2? > > >> >> > > >> >> We see libcamera as a continuation of V4L2. One that can more easily > > >> >> handle the recent advances in hardware design. As embedded cameras have > > >> >> developed, all of the complexity has been pushed on to the developers. > > >> >> With libcamera, all of that complexity is simplified and a single model > > >> >> is presented to application developers. > > >> > > > >> > Ok, that is interesting; thanks for the information. > > >> > > > >> >> > > >> >> That said its not totally our experience to have virtio devices act as > > >> >> simple pipes for some higher level protocol. The virtio-gpu spec says > > >> >> very little about the details of how 3D devices work and simply offers > > >> >> an opaque pipe to push a (potentially propriety) command stream to the > > >> >> back end. As far as I'm aware the proposals for Vulkan and Wayland > > >> >> device support doesn't even offer a feature bit but simply changes the > > >> >> graphics stream type in the command packets. > > >> >> > > >> >> We could just offer a VIRTIO_VIDEO_F_V4L feature bit, document it as > > >> >> incompatible with other feature bits and make that the baseline > > >> >> implementation but it's not really in the spirit of what VirtIO is > > >> >> trying to achieve. > > >> > > > >> > I'd not be in favour of an incompatible feature flag, > > >> > either... extensions are good, but conflicting features is something > > >> > that I'd like to avoid. > > >> > > > >> > So, given that I'd still prefer to have a single device: How well does > > >> > the proposed virtio-video device map to a Linux driver implementation > > >> > that hooks into V4L2? > > >> > > >> IMO it hooks into V4L2 pretty well. And I'm going to spend next few > > >> months making the existing driver fully V4L2 compliant. If this goal > > >> requires changing the spec, than we still have time to do that. I don't > > >> expect a lot of problems on this side. There might be problems with > > >> Android using V4L2 in weird ways. Well, let's see. Anyway, I think all > > >> of this can be accomplished over time. > > >> > > >> > If the general process flow is compatible and it > > >> > is mostly a question of wiring the parts together, I think pushing that > > >> > part of the complexity into the Linux driver is a reasonable > > >> > trade-off. Being able to use an existing protocol is nice, but if that > > >> > protocol is not perceived as flexible enough, it is probably not worth > > >> > encoding it into a spec. (Similar considerations apply to hooking up the > > >> > device in the hypervisor.) > > >> > > >> I very much agree with these statements. I think this is how it should > > >> be: we start with a compact but usable device, then add features and > > >> enable them using feature flags. Eventually we can cover all the > > >> use-cases of V4L2 unless we decide to have separate devices for them > > >> (virtio-camera, etc). This would be better in the long term I think. > > > > > > Camera's definitely have their quirks - mostly because many usecases are > > > hard to convey over a single Video device node (with the hardware) but I > > > think we might expect that complexity to be managed by the host, and > > > probably offer a ready made stream to the guest. Of course how to handle > > > multiple streams and configuration of the whole pipeline may get more > > > difficult and warrant a specific 'virtio-camera' ... but I would think > > > the basics could be covered generically to start with. > > > > > > It's not clear who's driving this implementation and spec, so I guess > > > there's more reading to do. > > > > > > Anyway, I've added Cc libcamera-devel to raise awareness of this topic > > > to camera list. > > > > > > I bet Laurent has some stronger opinions on how he'd see camera's exist > > > in a virtio space. > > You seem to think I have strong opinions about everything. This may not > be a complitely unfounded assumption ;-) > > Overall I agree with you, I think cameras are too complex for a > low-level virtualization protocol. I'd rather see a high-level protocol > that exposes webcam-like devices, with the low-level complexity handled > on the host side (using libcamera of course ;-)). This would support use > cases that require sharing hardware blocks between multiple logical > cameras, including sharing the same camera streams between multiple > guests. > > If a guest needs low-level access to the camera, including the ability > to control the raw camera sensor or ISP, then I'd recommend passing the > corresponding hardware blocks to the guest for exclusive access. > > > Personally I would rather see a separate virtio-camera specification > > that properly encapsulates all the various use cases we have for > > cameras. In many ways just processing a stream of video is a much > > simpler use case. > > > > During Linaro's Project Stratos we got a lot of feedback from members > > who professed interest in a virtio-camera initiative. However we were > > unable to get enough engineering resources from the various companies to > > collaborate in developing a specification that would meet everyone's > > needs. The problem space is wide from having numerous black and white > > sensor cameras on cars to the full on computational photography as > > exposed by modern camera systems on phones. If you want to read more > > words on the topic I wrote a blog post at the time: > > > > https://www.linaro.org/blog/the-challenges-of-abstracting-virtio/ > > > > Back to the topic of virtio-video as I understand it the principle > > features/configurations are: > > > > - All the various CODECs, resolutions and pixel formats > > - Stateful vs Stateless streams > > - If we want support grabbing single frames from a source > > > > My main concern about the V4L approach is that it pegs updates to the > > interface to the continuing evolution of the V4L interface in Linux. Now > > maybe video is a solved problem and there won't be (m)any new features > > we need to add after the initial revision. However I'm not a domain > > expert here so I just don't know. > > I've briefly discussed "virtio-v4l2" with Alex Courbot a few weeks ago > when we got a chance to meet face to face. I think the V4L2 kernel API > is a quite good fit in the sense that its level of abstraction, when > applied to video codecs and "simple" cameras (defined, more or less, as > something ressembling a USB webcam feature-wise). It doesn't mean that > the virtio-video or virtio-camera specifications should necessarily > reference V4L2 or use the exact same vocabulary, they could simply copy > the concepts, and stay loosely-coupled with V4L2 in the sense that both > specification should try to evolve in compatible directions. -- Regards, Laurent Pinchart