On 7/27/23 17:07, Tomasz Figa wrote:
CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
On Tue, Jul 4, 2023 at 5:41 PM Hsia-Jun Li <Randy.Li@xxxxxxxxxxxxx> wrote:
Hello All
This RFC will address the problem that some ioctl() commands would be
called many times until that command got all the all its entries. Those
ioctl() command are:
- VIDIOC_ENUM_FMT
- VIDIOC_QUERYCTRL
- VIDIOC_ENUMSTD and VIDIOC_SUBDEV_ENUMSTD
Generally speaking, all enumeration type commands would lead to
frequently context switch between userspace and kernel space. A few
enumeration commands listed below may not meet this problem in some
cases, as they could present their entries in a step wise way.
- VIDIOC_ENUM_FRAMESIZES
- VIDIOC_ENUM_FRAMEINTERVALS
Right, it's something that often comes in conversations with user
space developers on how it's difficult to query information from V4L2.
That said, we've mostly found it as an annoyance rather than a
practical problem. Do you have some specific case where the current
approach of enumeration causes a problem for the application (e.g.
performance)?
I have to admit the performance is not a problem for enumeration of
pixel formats. We only do that in the setup step.
The original problem before ext_pixel_format is enumeration of the
colorspace, 12 variant of colorimetries for a pixel format.
After we began to disccus the v4l2 ext APIs, I want to remove the
barrier that we can't update the v4l2 uAPI easily under the size limitation.
A simple solution that we could bring and improve from DRM is the blob
object(struct drm_property_blob).
We could extend the existing ioctl() in this way:
1. VIDIOC_ENUM_EXT_FMT would turn a blob id and the memory size
requirement that usespace should prepare
for storing.
2. Appication call VIDIOC_GETPROPBLOB with blob id and a userspace
pointer which should be enough for storing.
3. V4L2 framework fill the that userptr with context likes this:
struct v4l2_blob_prop {
__u32 version;
__u32 payload_size;
__u32 payload[];
};
4. The parsing of payload would depend on its version which
v4l2_blob_prop.version says, and each entry in the payload could be
variable length, likes this:
struct v4l2_ext_pix_mod_desc {
__u64 modifier;
__u64 allocate_hints; /* heap flags shard by DMA-heap */
__u32 num_of_planes;
__u32 plane_sizes[8];
__u32 colorimetry_flags;
};
struct v4l2_ext_pix_desc {
__u32 fourcc;
__u32 num_of_modifies;
struct v4l2_ext_pix_mod_desc[];
};
Since I'm not familiar with the DRM blob approach it might be an
obvious thing, but what happens when the application was written when
DRM doesn't have such design about API version.
let's say the version was 3, but the kernel it's running on now was
upgraded to version 4?
Nope, the API version is not selected by the userspace but kernel.
It is something likes the protocol version in wayland.
We would careful make a version of API but we don't not take the
responsibility for the back-compatible for the application.
In this design, we could avoid the problem that we could hardly extend
our uAPI for V4L2, all the structure must be a fixed size.
Here are some options design that people want for this RFC:
1. Do we need to call the ioctl() command itself(likes
VIDIOC_ENUM_EXT_FMT) which let the driver to flush its internal property
cache or calling VIDIOC_GETPROPBLOB is enough?
Wouldn't the former make it thread-unsafe?
Why we need thread safe here? V4L2 don't offer such thing.
I'd imagine something like this:
struct v4l2_property_array {
__u32 version; // Could be equal to the length of the struct
It should not be. version means the structure of the payload.
void __user *buf;
__u32 len;
__u32 num_formats;
__u32 num_frame_sizes;
__u32 num_modifiers;
__u32 num_controls;
__u32 num_menus;
// More fields could be added here for more types of
information to query, with next versions, which could be ignored by
old userspace.
You know what this buffer for(the property that you want) before this
step. That is not necessary to have a union reporting the items in the
payload.
A struct(or an item) in the payload(many items) itself could have a
variable array. How many items in the payload may not help on parsing
for the userspace.
};
If out_buf is NULL, the kernel fills len and num_* fields and returns.
Then the userspace allocates a big enough buffer and calls the ioctl
again with buf != NULL, which fills in the buffer pointed by buf as
below.
The maximum requirement size would be confirmed at the first step, a
ioctl() likes VIDIOC_ENUM_EXT_FMT.
{
(struct v4l2_ext_format[]) {
[0] = { ... },
[num_formats - 1] = { ... }
}
(struct v4l2_frame_size[]) {
[0] = { ... },
[num_frame_sizes - 1] = { ... }
}
// [...]
(struct v4l2_ctrl_menu[]) {
[0] = { ... },
[num_menus - 1] = { ... }
}
}
I think we could also let the user space fill in the version and num_
(presumably with 0) to control what information it wants the kernel to
return.
2. Should we make MC node support this feature only or standard video
node could? A thought from pinchartl is that every driver should have a
MC node even for the stateful driver.
The futuristic model that we envisioned back in time would be that
there is a media device that gives the user space ability to perform
operations on all the interfaces of the respective media controller
(or even all interfaces in the system - but that would require some
security considerations). In such a model, the userspace could prepare
an array (or blob if you prefer) of operations, where each operation
refers to an interface entity ID on which to execute the operation. It
would allow us to submit complete requests (or even many of them) in a
single ioctl, reducing the overhead significantly.
That said, it would be a complete overhaul of the API, so it might be
a step too big? Not sure. We certainly could benefit from V4L3. ;)
I thought of that. But I try to convince people those are not a big
change but essential.
The implementation of RFC could be a foundation for ext pixel and ext
buffer APIs. I would like to know your feedback before we settle the
problem with the ext pixel format.
I feel like this kind of batch operation would be more important for
buffer management, because it could allow dequeuing multiple buffers
in one ioctl, which is actually something that is starting to show up
as a real performance bottleneck for complex devices, such as ISPsI think I make it for MC node from the idea of Pinchart.
(We have a MediaTek ISP that has a really big number of DMA engines,
which accounts to about 200 DQBUFs per frame. Let me add +Yunke Cao
I need to know how it happens.
who's been looking into what performance effects it has and how we
could improve it.)
We need to hurry up on settling the design of m2m or v4l2 framework.
We meet the fourcc namespace problem at least 5 years ago, but we still
didn't fix it.
--
Hsia-Jun(Randy) Li
--
Hsia-Jun(Randy) Li