Report of the V4L2 brainstorm meeting October 10-11 2016, Berlin

Attendees:

Sakari Ailus <sakari.ailus@xxxxxx>
Kieran Bingham <kieran@xxxxxxxxxxx>
Lars-Peter Clausen <lars@xxxxxxxxxx>
Guennadi Liakhovetski <g.liakhovetski@xxxxxx>
Pawel Osciak <pawel@xxxxxxxxxx>
Laurent Pinchart <laurent.pinchart@xxxxxxxxxxxxxxxx>
Maxime Ripard <maxime.ripard@xxxxxxxxxxxxxxxxxx>
Hans Verkuil <hverkuil@xxxxxxxxx>

The raw etherpad notes are available here:

https://hverkuil.home.xs4all.nl/v4l2-requests-2016.html

If I misinterpreted something (perfectly possible), then please reply with a
correction!

Goal of the brainstorm meeting:

Drivers for stateless hardware codecs rely on the Request API, which has still
not been merged. As a result a bunch of drivers remain out of tree. The purpose
of the meeting is to determine what the factors are that prevent the Request
API from being merged and how that can be resolved.

Note: ChromeOS uses an older version of the Request API patch series. At that
time it was called 'Configuration Store API', but other than the different name
it is pretty much the same API.

1 Request API background

The request API is a way of associating configuration data with buffers. The
original version associated controls with buffers:

https://lwn.net/Articles/641204/

This functionality is required to support stateless HW codecs, where the
controls are used to pass the state (as maintained by userspace) to the
hardware together with the buffer that has to be en/decoded.

Android's cameraHAL v3 needs similar functionality, but there you also want to
associate formats, selection rectangles and topology changes with the request.
A version of the Request API doing that is here:

https://lwn.net/Articles/688585/

However, that work doesn't contain the code to associate controls with buffers.

So we have two versions: one suitable for codecs, one more geared towards
cameraHAL v3. But we need both.

1.1 Request IDs

The original patch series used integer IDs to identify requests.
Pawel suggested that we use file descriptors instead (similar to dmabuf) to
keep track of ownership and to get better permission handling. This was agreed
to, since it avoids the risk of request objects being hijacked by other
applications.

1.2 Allocation of the request FD

Example with a device that has one media node and two video nodes, where the
sensor sends video to one video node and e.g. statistics to another video node:

    open("media0") -> fd10
    open("video0") -> fd11 (capture queue)
    open("video1") -> fd12 (capture queue)
    ioctl(fd10, REQUEST, ALLOC) -> fd13
    ioctl(fd11, QBUF, { buffer1, fd13 })
    ioctl(fd12, QBUF, { buffer2, fd13 })
    ioctl(fd10, REQUEST, QUEUE { fd13 })

While it makes sense to allocate the request via the media device node in most
cases, there was a lot of discussion about whether an exception should be made
for mem2mem devices such as hardware codecs.

M2M devices today consist of a single video node, and every time it is opened a
new instance is created (as if a new virtual m2m device was created). The
driver then schedules hardware access for the various instances.

This means that the request should be associated with a specific m2m instance
(i.e. file descriptor). The easiest way to do that is simply to allocate the
request via the video device, instead of creating a new media device and coming
up with a way to associate the request with an m2m instance.

Note that today's out-of-tree implementation *does* create the request via the
video node.

A final decision on this was postponed until we have a working implementation
and a better idea of the pros and cons of creating such an exception.

When allocating a request the framework will fill in the request state: this
will either be a copy of the current hardware state, or a clone of the state of
an existing request (the application selects this by passing the fd of the
request that should be cloned, or by passing 0 to copy the current state).
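
To make the initial-state rule above a bit more concrete, here is a small
userspace toy model in C. It is only a sketch and not kernel code: the
toy_state and toy_request types and the toy_request_init() helper are
hypothetical names invented for this illustration, and a NULL base pointer
stands in for "passing 0" in the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for whatever configuration a request carries. */
struct toy_state {
    int exposure;
    int width;
    int height;
};

/* Hypothetical request object: just a private copy of the state. */
struct toy_request {
    struct toy_state state;
};

/*
 * Fill in the state of a freshly allocated request.
 * If base is non-NULL the request clones the state of that request,
 * otherwise it takes a snapshot of the current hardware state hw.
 */
static void toy_request_init(struct toy_request *req,
                             const struct toy_state *hw,
                             const struct toy_request *base)
{
    assert(req != NULL && hw != NULL);
    req->state = base ? base->state : *hw;
}
```

In this model a cloned request keeps the values of its base even if the
hardware state changes afterwards, which is the property that makes request
validation independent of earlier requests.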

1.3 Request validation

Some request requirements:

- When you queue a request it is validated.
- The request has a full state, so the validation is independent of any
  previously queued requests.
- If an error occurs, no good error reporting is available. But the only
  reason for such errors would be hardware failures, which usually indicate a
  serious unrecoverable problem and should trigger a halt (requests cancelled,
  flagged with error)
- A change in hardware configuration (e.g. disconnecting an HDMI input) could
  also invalidate requests

1.4 Completing requests

- A request is completed when all buffers are dequeued and all results are
  copied to the request.
- An event of some sort is needed to signal that the request has been
  completed. Since the request is an FD, poll() can be used on it to signal
  this.
- To delete a request the application has to call close(). The request will
  actually be deleted once the refcount goes to 0.
- A request cannot be re-used without re-initializing it first, either by
  cloning the current HW state or an existing request, or by 'cloning' itself,
  in which case the request's configuration will be re-used. This last option
  is useful for codecs: this is how the current out-of-tree implementation
  works.

1.5 Requests vs. w/o requests

There are three options for drivers w.r.t. the request API:

1) The driver doesn't use the request API
2) The driver requires the request API
3) The request API is optional.

It is not clear at this stage if 3 is ever needed. So for now just add a
capability flag for "requests required", and add a "requests supported with
legacy" flag later on if needed.

1.6 Minimum stateless codec requirements

- Request allocation
- Request capability flag
- Associate controls and buffers with requests

1.7 Stateless codec behaviour

- Each OUTPUT buffer produces one CAPTURE buffer of data.
- The user is responsible for providing just the right amount of data to the
  decoder.
  - Less data than required still produces a CAPTURE buffer, but with the
    ERROR flag set
  - More data than required still produces a CAPTURE buffer, but the rest of
    the data is discarded

The hardware processes jobs that consume a single OUTPUT buffer and fill a
single CAPTURE buffer. There are two ways to associate buffers with requests:

1) We can create requests that contain one OUTPUT buffer and one CAPTURE
   buffer. When a request completes, the OUTPUT and CAPTURE buffers contain
   the request FD they were queued with. These are the same buffers that were
   associated with the request before the request was queued.

   This works if there is a 1-to-1 relationship between output and capture
   buffers.

2) We can prequeue a set of buffers on the CAPTURE side, and create requests
   that contain one OUTPUT buffer. When a request completes, the next DQBUF
   call will return an unspecified buffer ID and the request FD of the request
   that just completed.

   This is typically used by codecs where the capture buffers can be
   out-of-order.

- How to signal this to the application? Should we bother with 1) at all?

1.8 Sequence of calls for stateless codecs

- Open the video device node V -> fd1
- Option TBD: Open the media controller device node M -> fd2?
- Allocate a set of requests:

    #define MEDIA_REQ_CMD_ALLOC     0
    #define MEDIA_REQ_CMD_DELETE    1
    #define MEDIA_REQ_CMD_APPLY     2
    #define MEDIA_REQ_CMD_QUEUE     3
    #define MEDIA_REQ_CMD_INIT      4

    struct media_request_cmd {
            __u32 cmd;
            __u32 request;  /* Is a file descriptor */
            __u32 base;     /* Is a file descriptor */
    };

    for (...) {
            struct media_request_cmd cmd = {
                    .cmd = MEDIA_REQ_CMD_ALLOC
            };

            ret = ioctl(fd1 or fd2, MEDIA_IOC_REQUEST_CMD, &cmd);
            list_add(cmd.request);
    }

- Process a frame:

    struct v4l2_ext_control ctrls[...];
    struct v4l2_ext_controls ext_ctrls = {
            .controls = ctrls,
            .request = ...
    };
    struct v4l2_buffer buf = {
            .index = ...,
            .request = ...
    };
    struct media_request_cmd cmd = {
            .cmd = MEDIA_REQ_CMD_QUEUE,
            .request = ...
    };

    /* Attach the controls and the buffer to the request... */
    ioctl(fd1, VIDIOC_S_EXT_CTRLS, &ext_ctrls);
    ioctl(fd1, VIDIOC_QBUF, &buf);
    ...
    /* ...and then queue the request itself. */
    ioctl(fd1 or fd2, MEDIA_IOC_REQUEST_CMD, &cmd);

2 Conclusions

- File descriptors are used to refer to requests
- A request is validated when it is queued
- On mem-to-mem devices ONLY, the requests are created on the video device in
  order to associate them with the mem-to-mem context. On all other devices,
  the requests are created on the media device. (TBC once this is implemented)
- All mem-to-mem devices must have a per-file handle context. This is probably
  already the case, but this must be verified and documented.
- An event is created on request completion.
- Finished requests cannot be requeued until you re-initialize the state to
  some initial state (either the HW state or that of another request). If a
  request is re-initialized to itself, then nothing changes, other than that
  the request can now be requeued. The reason for this is to force the
  application to think about the state that the request should contain.
  Reusing requests also avoids closing and opening fds all the time, and so
  avoids running out of fds in the middle of streaming, which can happen in
  e.g. a browser where some websites (Facebook) request a huge number of fds.
- The initial request state may originate either from the current device state
  or from a given request.
- An actual request handling implementation in a framework should not copy the
  entire state but refer to a different request or the current device state
  instead. The aim here is to avoid copying large amounts of state information
  that is unchanged in most cases anyway.
- The scope of the requests may be limited to less than the full
  configurability of the device
  - E.g. sensor exposure time is most likely out of scope of requests, whereas
    buffers would go to the request
  - How to tell the user what the exact scope of the request API in the driver
    is?
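
The requeue rule in the conclusions above can be pictured as a small state
machine. The C snippet below is only a userspace toy model under my own
assumptions: the toy_req type, the three status values and the choice of
-EBUSY as error code are all hypothetical and were not discussed at the
meeting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical request life cycle for this toy model. */
enum toy_req_status {
    TOY_REQ_IDLE,      /* initialized, may be queued */
    TOY_REQ_QUEUED,    /* handed over for processing */
    TOY_REQ_COMPLETED  /* finished, must be re-initialized before reuse */
};

struct toy_req {
    enum toy_req_status status;
    int config;        /* stand-in for controls, formats, ... */
};

/* Queueing is only allowed for a freshly (re-)initialized request. */
static int toy_req_queue(struct toy_req *r)
{
    if (r->status != TOY_REQ_IDLE)
        return -EBUSY;
    r->status = TOY_REQ_QUEUED;
    return 0;
}

/* Called when all buffers of the request are done. */
static void toy_req_complete(struct toy_req *r)
{
    assert(r->status == TOY_REQ_QUEUED);
    r->status = TOY_REQ_COMPLETED;
}

/*
 * Re-initialize r from base, or from hw_config when base is NULL.
 * Passing base == r 'clones itself': the configuration is kept and
 * only the status is reset so that the request can be queued again.
 */
static int toy_req_reinit(struct toy_req *r, const struct toy_req *base,
                          int hw_config)
{
    if (r->status == TOY_REQ_QUEUED)
        return -EBUSY;
    r->config = base ? base->config : hw_config;
    r->status = TOY_REQ_IDLE;
    return 0;
}
```

The point of the model is that a completed request is rejected by
toy_req_queue() until toy_req_reinit() has been called, so the application
always has to make an explicit choice about the state the request starts from.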

3 Work split

Laurent:
- Change the request API to use file descriptors to refer to requests
- Core request API (allocate, delete, queue, clone) + QBUF

Sakari:
- Make sure each mem-to-mem device does use a context
- Fix the drivers that do not have a context (if any)
- Update the API documentation accordingly

Tentative timeline for the actions above: first half of December (earlier if
possible).

Hans:
- Use core request API to add control support

Laurent:
- Use core request API to add support for S_FMT, S_SELECTION

The last two actions are done in parallel.

Timeline for these two actions: end of January. We really need a first
workable patch series by then.

4 Open questions and random thoughts

4.1 What happens if an application associates a buffer with a request and then
calls streamoff?

We need to define what happens with that buffer and request.

4.2 Complex camera devices that have ISPs and I2C connected sensors or other
sub-devices

- I2C bus access takes time and the sensor settings need to be applied at a
  particular point of time, well before the settings take effect
- In case of Android, the device specific HAL is aware of the timing model of
  the underlying hardware:
  - This reflects the current implementations
- Splitting the timing model between the kernel and the user space would not
  improve the reliability of the system and would make resolving a complex
  problem even more complex:
  - Sophisticated timing model required, this is most likely at least somewhat
    specific to hardware
  - Interfaces required for ISP and sensor to provide information on their
    configuration latching points
  - The user space needs much of this information in order to keep requests
    going
  - Error handling in case the kernel cannot apply sensor settings in time

4.3 Asymmetric devices (that produce a different number of buffers per
consumed buffer)

- How do we handle M2M devices which could produce 0 or more buffers of
  output?
- We know the maximum number of buffers that we will produce, but not the
  actual number
- Output buffers can be held internally until enough data is available to
  produce the first capture result.

4.4 Streamon and streamoff

- How to handle STREAMON/OFF?
- Pipelines without video nodes?
- Buffer size may change. Need to reallocate buffers, which is currently done
  through streamoff, reqbufs and streamon.
- We probably want a STREAM_CANCEL which would dequeue pending buffers
  (requests) without stopping streaming.

4.5 Concurrent access (or how to make requests for pipelines?)

What to do if the topology of the media device can be configured in two
independent pipelines? You would want to make requests for each pipeline, each
containing a partial configuration (just for that pipeline) instead of the
full media state. There currently is no object which represents a pipeline.

- Exclusive access to the request queue? Or entities? Or pads? Really: how to
  make requests for pipelines? (currently there is no object to represent a
  pipeline)
- Define error codes in advance
- Implementations can try to request access to 'sub-parts' of pipelines
  - This should be done by requesting exclusive access to given pads (with
    the possible exception of multiplexed pads)
  - A request should be validated against the pads to which the fd has
    exclusive access

4.6 Video buffer queues

- Currently vb2 only deals with one queue at a time
- Multiple queues have to be handled by the driver instead. This is quite
  painful if the queues are part of the same pipeline.
- vb2 should help with multiple queues. Helper function / framework at the
  same level as the m2m framework? The basic idea is to e.g. delay the start
  of streaming until all queues are ready for it.