The original Media Mini Summit report can be found here:

https://lore.kernel.org/linux-media/02ad4845-d7d8-d95c-ae7e-3229f1dc86b3@xxxxxxxxx/T/#u

It mentioned that the report about the Kernel CAM topic would be posted later, so I'm pleased to finally post that. Thanks to all who participated in the meetings and in the reviewing of this report.

Kernel CAM (Ricardo Ribalda)
============================

Slides at https://drive.google.com/file/d/1Tew21xeKmFlQ7dQxMcIYqybVuQL7La1a/view

Kernel Recipes talk about Kernel CAM:
https://kernel-recipes.org/en/2022/talks/rethinking-the-kernel-camera-framework/

This discussion spanned two days: Monday morning on September 16th as part of the Media Mini-Summit in Dublin, followed by another meeting in a smaller group on Tuesday morning. Many thanks to Google for organizing a room for us on Tuesday. On Monday morning the discussion was specifically about the Kernel CAM proposal; on Tuesday morning the discussion was more about how to proceed.

This report attempts to capture the discussions of those two days, but does not necessarily reflect any final agreement between the parties involved. The report ends with two final comments: from myself as submaintainer of the media subsystem (V4L2 in particular), and from Mauro as maintainer of the media subsystem.

Attendees
---------

Sakari Ailus <sakari.ailus@xxxxxxxxxxxxxxx>
Kieran Bingham <kieran.bingham@xxxxxxxxxxxxxxxx>
Mauro Carvalho Chehab <mchehab@xxxxxxxxxx> (Remote, Monday only)
Nicolas Dufresne <nicolas@xxxxxxxxxxxx>
Hugues Fruchet <hugues.fruchet@xxxxxx>
Benjamin Gaignard <benjamin.gaignard@xxxxxxxxxxxxx>
Jacopo Mondi <jacopo@xxxxxxxxxx>
Benjamin MUGNIER <benjamin.mugnier@xxxxxxxxxxx> (Remote, Monday only)
Michael Olbrich <m.olbrich@xxxxxxxxxxxxxx>
Laurent Pinchart <laurent.pinchart@xxxxxxxxxxxxxxxx>
Ricardo Ribalda <ribalda@xxxxxxxxxxxx>
Maxime Ripard <maxime@xxxxxxxxxx> (Monday only)
Daniel Scally <djrscally@xxxxxxxxx> (Monday only)
Jernej Škrabec <jernej.skrabec@xxxxxxxxx> (Monday only)
Niklas Söderlund <niklas.soderlund@xxxxxxxxxxxx> (Monday only)
Dave Stevenson <dave.stevenson@xxxxxxxxxxxxxxx>
Michael Tretter <m.tretter@xxxxxxxxxxxxxx> (Monday only)
Chen-Yu Tsai <wenst@xxxxxxxxxxxx> (Monday only)
Hans Verkuil <hverkuil@xxxxxxxxx>

Monday
------

In Android, the vendor is responsible for the implementation: between the hardware and the Android camera HAL you can have anything (e.g. user space drivers). ChromeOS currently supports several SoC vendors (Mediatek, Rockchip, Intel, Qualcomm, more to come) and expects that the kernel code is upstream.

Documenting ISP algorithm parameters reveals information about the internal implementation of the image processing steps, and vendors are not willing to do so. Can a driver be upstream while being usable only by the original vendor? Even if it offers enough functionality to be used by anyone and only leaves a small percentage of advanced features undocumented? Vendors may be unwilling to reveal the tuning parameters they use.

Documentation requirements + the difficulty of getting new features accepted into V4L2 are the (main?) reasons for downstream V4L2-alike drivers.

Maxime notes that the slide about Vendors' Complaints has the same arguments used by GPU vendors against upstreaming drivers. API documentation is required for upstreaming.

ISPs have traditionally been a pain point upstream-wise. There is more push to get these upstream due to the ChromeOS upstreaming requirement, and due to user-visible problems (e.g. IPU6).
Laptop vendors are also unhappy with the lack of upstream driver support for their computers.

V4L2 is more of a technical limiting factor than it was before, e.g. with 96 DMAs in one ISP.

Nicolas and Hans noted that downstream GPL drivers are fine. Upstreaming a driver requires opening up the driver interfaces. There appears to be near-unanimous consensus on this (apart from Ricardo).

Ricardo said that he would rather have a driver that supports all the standard use-cases than no driver at all. He is OK with giving the vendor access to the "advanced" cores for a proprietary userspace, if those cores have no access to memory. If a vendor wishes to upstream a driver with partial functionality, that's fine. A downstream variant of this driver may support more functionality (that is not documented). Ricardo's preference would be to document only certain basic features and leave the rest undocumented.

Vendors need to plan for upstreaming and understand the requirements for getting drivers upstream.

Implementing the Android camera HAL API 3 full profile is entirely possible using V4L2. The Request API is not very helpful here and is not used by ISP drivers.

What developments API-wise do we expect in this area? Nokia FCAM and then Android camera HAL 3 introduced the concept of a request that binds the frame and the parameters. There have been no major new developments since then, in almost 10 years. Minor changes:

- Changing parameters from registers to buffers
- New pixel formats

If a new upstream API for ISPs were to be developed, then it would be what we use for the next two decades.

Sakari: V4L2 is not an impossible API for IPU6, just a poor one.

Hans: V4L2 could be amended with support for disconnecting the relation between DMAs and video devices, i.e. doing this directly through the media controller instead of having to deal with zillions of device nodes.

Mauro: a discussion with the vendors' ISP teams would be useful.

A short discussion of the proposed Kernel CAM API followed:

- 3 concepts:
  * Entities: represent a hardware node and have properties and events. Properties do not need to map hardware registers 1:1.
  * Operation: a set of reads and writes to an entity. Multiple operations can be queued at the same time and can be scheduled after one or more triggers: events, fences, operation completion, timers.
  * A single device node, /dev/cam, to control all the entities, to queue operations and to get the responses from them.
  More information can be found in the presentation linked to at the top.
- The interface looks like a register address space.
- It is a per-driver interface.
- These registers do not necessarily map to hardware registers.
- Hans: this will not go upstream if it allows vendor-only access to parts of the hardware.
- Many newer ISPs have firmware the driver talks to.
- There is a plan to support real hardware soon.
- In an offline discussion Mauro noted that 'CAM' is used in DVB where it stands for "Conditional Access Module", so "Kernel CAM" is something that should be renamed if we decide to continue with this.

Some conclusions from this discussion
-------------------------------------

Everyone agrees that the current V4L2 API is not very suitable for the current generation of ISPs: it is too cumbersome. We would be happy to work with ChromeOS and/or vendors to attempt to improve it.

The proposed Kernel CAM API is considered much too vague to comment on. An example driver for real hardware will be needed first. A register-level interface is too low-level an interface for a complex kernel driver.
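To make that last point a bit more concrete, here is a purely hypothetical sketch of what queuing an "operation" through the proposed single device node might look like. None of these structure names, ioctl numbers or property values exist anywhere; they are invented here solely from the concepts listed in the slides (entities with properties, operations as batches of writes scheduled on a trigger):

#include <fcntl.h>
#include <stdint.h>
#include <linux/ioctl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical: a single write to a property of an entity. */
struct cam_write {
        uint32_t entity;    /* driver-defined entity id */
        uint32_t property;  /* driver-defined property "address" */
        uint64_t value;     /* value to write */
};

/* Hypothetical: an operation is a batch of writes plus a trigger. */
struct cam_operation {
        uint32_t trigger;   /* event / fence / previous completion / timer */
        uint32_t num_writes;
        struct cam_write writes[4];
};

/* Hypothetical ioctl; the magic number is arbitrary. */
#define CAM_IOC_QUEUE_OP _IOW('X', 0, struct cam_operation)

int main(void)
{
        /* One device node controls all entities of the ISP. */
        int fd = open("/dev/cam0", O_RDWR);
        if (fd < 0)
                return 1;

        /*
         * Queue a batch of property writes to one entity, to be applied
         * when the previous operation completes.  The entity and property
         * numbers are opaque, per-driver values: they need not correspond
         * to anything an open-source userspace can interpret, which is
         * exactly the part that was objected to in the discussion.
         */
        struct cam_operation op = {
                .trigger = 0,  /* e.g. "on completion of previous operation" */
                .num_writes = 2,
                .writes = {
                        { .entity = 3, .property = 0x120, .value = 0x01 },
                        { .entity = 3, .property = 0x124, .value = 0x40 },
                },
        };
        ioctl(fd, CAM_IOC_QUEUE_OP, &op);

        close(fd);
        return 0;
}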
A number of newer ISPs implement a (slightly) higher-level message-passing interface towards software, rather than something that would provide more direct control of the hardware such as what is traditionally meant by a register interface.

For upstreaming ISP drivers (regardless of the API used), multiple options have been discussed, without any decision made yet regarding which options would be accepted by the community:

1) A fully open source upstream driver. This does require opening up hardware access and documenting the API, but not the algorithms used to configure the HW optimally; those can remain closed. However, enough information must be available so that an open source implementation can be made. Firmware may be used for low-level control of the hardware.

2) A basic upstream driver that gives a reasonable quality picture. The BSP can be used if the vendor-specific camera control algorithms are needed to improve quality. The BSP would add those missing pieces to the upstream basic driver.

3) The upstream driver talks to vendor-provided firmware instead of directly to the hardware: the camera control algorithms are in the FW under control of the vendor. This does lock in the hardware to a specific use-case that the FW is optimized for.

4) As an alternative to fully documenting the driver API, an open source user space implementation can be provided. The driver UAPI needs to be more or less fully used, and in practice this would involve libcamera. Note that this would be similar to the DRM requirements; V4L2/MC are effectively not an application API for these devices.

(This isn't meant to be seen as a fully articulated list of requirements, as that part wasn't discussed in the meeting.)

And of course, if no agreement can be reached on any of the options above, then we can keep everything out-of-tree.

Tuesday
-------

The main focus of the second day was on identifying the main pain points and agreeing on a list of requirements that any new API should fulfill.

## Problems with existing V4L2-based camera stacks

Ricardo initially went through the problems that vendors and ChromeOS are having with V4L2:

### Problems from vendors

Vendors want to be able to use a kernel API that is respectful of their IP:

* Protect investment
* Licensing is simple
* Do not have to release their 3A algorithms (external algorithms)
* Do not have to release how their imaging blocks are implemented (internal algorithms)

Today the upstream community allows vendors to ship a closed source implementation of 3A algorithms, provided that an open source implementation can also be made. There are no plans to change this policy.

Vendors want to leverage their investment in Android, and V4L2 does not map 1:1 to HAL3.

### Problems from Chrome OS

ChromeOS has an Upstream First commitment, but also wants to enable as much hardware as possible for as many users as possible. ChromeOS has provided a lot of resources in the past to upstream vendor code, but this is not scalable and cannot be done at the speed that is required by the market.

Today downstream drivers can access features not available to the open-source stack. ChromeOS thinks that there should be a mechanism to access vendor features on the open-source stack, assuming that the open source stack is feature complete.

## Looking at the future

We listed a set of requirements that a kernel API should have to support ISP-based cameras. The goal is not to replace V4L2 for every single type of device.
* Allowing different application-facing APIs to be built on top of the stack, e.g.:
  - libcamera
  - Android
  - Industrial APIs
* The kernel API shouldn't be designed as an application API, but should require a middleware (e.g. Mesa for graphics, libcamera)
* Vendors must be involved
  * Vendors work (with) upstream
* IP protection
  * It's a vendor requirement, no particular requirement from Chrome OS
* Overhead of the API should be low
  * Limiting the number of context switches or system calls
* Flexible memory management (alloc/free individual buffers at runtime, buffers of different sizes, ...)
  * Separating management of buffers from configuration of the device
* Support interoperability with other devices (codecs, display, accelerators, ...)
  * Memory buffer constraints (possibly using the Unix Device Memory Allocator, if it ever gets finalized)
  * dmabuf
  * Fences
* Atomic actions
* Security with an untrusted userspace
  * No access to foreign memory
  * No ill side-effects on any part of the system outside of the camera pipeline
  * No physical damage to the hardware
  * Can lock up the camera, but this needs to be recoverable by software using the camera API only
  * Images can be corrupted, that is accepted
* IP disclosure
  * Every upstream driver must be posted with an open-source userspace that showcases the API, in libcamera, the Android camera HAL or another camera stack released under an open source license. This must be compilable and testable by the community on available hardware using only open-source software (e.g. if the implementation targets Android, it would need to be compilable and testable on a device using AOSP, not a vendor binary Android release).
  * The open-source user space can have closed-source 3A algorithms, but it must be possible to develop an open-source implementation of the 3A algorithms that has access to the exact same device features as the closed-source version.
  * Closed-source userspace is supported, with the limitation that it has access to the same features as the open-source userspace.
* Modularity of components in the camera pipeline
  * Camera sensor drivers must interoperate with any ISP driver that supports the same camera sensor bus.

Final comments from Hans and Mauro
==================================

These comments are added to clearly state our views as media maintainers.

Hans (as media submaintainer):

It is *not* an option to upstream a driver that has support for undocumented closed features. Basically, maintainers can't put their name on something that contains features that are unverifiable (by them) and unusable (by all except the vendor).

Mauro (as media maintainer):

My view as media maintainer is similar to Hans's: the non-firmware code required for the camera to work (either kernelspace or userspace) should be open-sourced, up to the point that video streams/images are properly captured in a non-proprietary video output format, in such a way that it would be possible to use open-source implementations of the 3A algorithms if someone is willing to write them. So, the registers used for image enhancement and metadata should be properly documented.

This is especially important since, usually, chipset maintainers stop implementing features when newer generations of the hardware arrive, leaving users of the (not so) old hardware without decent support. With proper documentation, if enough developers want to keep their hardware working, they could implement replacements for the proprietary code if needed.