Hi Guennadi,

On Tuesday 22 December 2015 12:16:27 Guennadi Liakhovetski wrote:
> On Mon, 21 Dec 2015, Laurent Pinchart wrote:
> > On Wednesday 16 December 2015 12:25:24 Guennadi Liakhovetski wrote:
> >> On Wed, 16 Dec 2015, Hans Verkuil wrote:
> >>> On 12/16/15 10:37, Guennadi Liakhovetski wrote:
> >>>> Hi all,
> >>>>
> >>>> A project I am currently working on requires acquiring per-frame
> >>>> metadata from the camera and passing it to user-space. This is not
> >>>> the first time this has come up, and I know such discussions have
> >>>> been held before. A typical user is Android (also my case), where
> >>>> you have to provide the user with the parameter values that were
> >>>> used to capture a specific frame. I know Hans is working on one
> >>>> side of this process - sending per-request controls,
> >>>
> >>> Actually, the request framework can do both sides of the equation:
> >>> giving back metadata in read-only controls that are per-frame.
> >>> While ideally the driver would extract the information from the
> >>> binary blob and put it in nice controls, it is also possible to
> >>> make a control that just contains the binary blob itself. Whether
> >>> that's a good approach depends on many factors, and that's another
> >>> topic.
> >>
> >> Yes, sorry, I didn't mention this possibility. On the one hand I
> >> agree that this would look nice and consistent - you send a bunch
> >> of controls down and you get them back in exactly the same way,
> >> nicely taken apart. OTOH there are some issues with that:
> >>
> >> 1. Metadata values can indeed come from the camera in a buffer
> >> that is DMAed by the bridge - we have such examples. In our
> >> use-cases those buffers are separate from the main data, so the
> >> driver could allocate them itself, but can there be cases in which
> >> those buffers have to be supplied by the user?
> > The only case I can think of where the user would benefit from
> > supplying the buffer is sharing metadata with other processes
> > and/or devices *if* the amount of metadata is so large that a
> > memcpy would negatively affect performance. And I can't think of
> > such a case at the moment :-)
>
> Ok, so we could for now limit metadata buffer support to driver
> allocation.
>
> >> 2. Size - not sure how large those control buffers can become; in
> >> the use-cases that I'm aware of we transfer up to 20 single-value
> >> parameters per frame.
> >
> > I have to deal with a system that can transfer up to ~200
> > parameters per frame (at least in theory).
>
> Are they single-value (say, up to 32 bits) parameters, or can they
> be arrays / data chunks?

They can be arrays as well.

> >> 3. With control values delivered per DMA, it's the bridge driver
> >> that gets the data, but it's the sensor subdevice driver that
> >> knows what that buffer contains. So, to deliver those parameters
> >> to the user, the sensor driver control processing routines will
> >> have to get access to that metadata buffer. This isn't supported
> >> so far, even with the proposed request API?
> >
> > Correct. My current implementation (see
> > git://linuxtv.org/pinchartl/media.git drm/du/vsp1-kms/request)
> > doesn't deal with controls yet, as the first use case I focused on
> > for the request API primarily requires setting formats (and links,
> > which are my next target).
> >
> > My other use case (Android camera HAL v3 for Project Ara) mainly
> > deals with controls and metadata, but I'll then likely pass the
> > metadata blob to userspace as-is, as its format isn't always known
> > to the driver. I'm also concerned about efficiency but haven't had
> > time to perform measurements yet.
>
> Hm, why is it not known to the subdevice driver? Does the buffer
> layout depend on some external conditions? Maybe loaded firmware?
> But it should be possible to tell the driver, say, that the current
> metadata buffer layout has version N?

My devices are class-compliant but can use a device-specific metadata
format. The kernel driver knows about the device class only; knowledge
about any device-specific format is only available in userspace.

> Those metadata buffers can well contain some parameters that can
> also be obtained via controls. So, if we just send metadata buffers
> to the user as-is, we create duplication, which isn't nice.

In my case there won't be any duplication, as there will likely be no
control at all, but I agree with you in the general case.

> Besides, the end user will want broken-down control values anyway.
> E.g. in the Android case, the app is getting single controls, not
> opaque metadata buffers. Of course, one could create a vendor
> metadata tag "metadata blob," but that's not how Android does it so
> far.
>
> OTOH passing those buffers to the subdevice driver for parsing and
> returning them as an (extended) control also seems a bit ugly.
>
> What about performance cost? If we pass all those parameters as a
> single extended control (as long as they are of the same class), the
> cost won't be higher than dequeuing a buffer? Let's not take the
> parsing cost and the control struct memory overhead into account for
> now.

If you take nothing into account then the cost won't be higher ;-)
It's the parsing cost I was referring to, including the cost of
updating the control value from within the kernel.

> User-friendliness: I think implementors would prefer to pass a
> complete buffer to user-space, to avoid having to modify drivers
> every time they modify those parameters.
>
> >>>> but I'm not aware whether he or anyone else is actively working
> >>>> on this already or is planning to do so in the near future? I
> >>>> also know that several proprietary solutions have been developed
> >>>> and are in use in various projects.
> >>>>
> >>>> I think a general agreement has been that such data has to be
> >>>> passed via a buffer queue. But there are a few possibilities
> >>>> there too. Below are some:
> >>>>
> >>>> 1. Multiplanar. A separate plane is dedicated to metadata.
> >>>> Pros: (a) metadata is already associated with the specific
> >>>> frames it corresponds to. Cons: (a) a correct implementation
> >>>> would specify the image plane fourcc separately from any
> >>>> metadata plane format description, but we currently don't
> >>>> support per-plane format specification.
> >>>
> >>> This only makes sense if the data actually comes in via DMA and
> >>> if it is large enough to make it worth the effort of implementing
> >>> this. As you say, it will require figuring out how to do
> >>> per-plane fourcc.
> >>>
> >>> It also only makes sense if the metadata comes in at the same
> >>> time as the frame.
> >>>
> >>>> 2. Separate buffer queues. Pros: (a) no need to extend the
> >>>> multiplanar buffer implementation. Cons: (a) more difficult
> >>>> synchronisation with image frames, (b) still need to work out a
> >>>> way to specify the metadata version.
> >>>>
> >>>> Any further options? Of the above my choice would go with (1),
> >>>> but with a dedicated metadata plane in struct vb2_buffer.
> >>>
> >>> 3. Use the request framework and return the metadata as
> >>> control(s). Since controls can be associated with events when
> >>> they change, you can subscribe to such events. Note: currently I
> >>> haven't implemented such events for request controls, since I am
> >>> not certain how they would be used, but this would be a good test
> >>> case.
> >>>
> >>> Pros: (a) no need to extend the multiplanar buffer
> >>> implementation, (b) syncing up with the image frames should be
> >>> easy (both use the same request ID), (c) a lot of freedom in how
> >>> to export the metadata.
> >>> Cons: (a) the request framework is still work in progress
> >>> (currently worked on by Laurent), (b) probably too slow for
> >>> really large amounts of metadata; you'll need proper DMA handling
> >>> for that, in which case I would go for 2.
> >
> > (a) will eventually be solved, (b) needs measurements before
> > discussing it further.
> >
> > > For (2) (separate buffer queue) would we have to extend
> > > VIDIOC_DQBUF to select a specific buffer queue?
> >
> > Wouldn't it use a separate video device node ?
>
> Ok, that seems like a better option to me too, agreed.
>
> >>>> In either of the above options we also need a way to tell the
> >>>> user what is in the metadata buffer - its format. We could
> >>>> create new FOURCC codes for them, perhaps as V4L2_META_FMT_...,
> >>>> or the user space could identify the metadata format based on
> >>>> the camera model and an opaque type (metadata version code)
> >>>> value. Since metadata formats seem to be extremely
> >>>> camera-specific, I'd go with the latter option.
> >>>>
> >>>> Comments extremely welcome.
> >>>
> >>> What I like about the request framework is that the driver can
> >>> pick apart the metadata and turn it into well-defined controls,
> >>> so the knowledge of how to do that is in the place where it
> >>> belongs. In cases where the metadata is simply too large for that
> >>> to be feasible, I don't have much of an opinion. Camera + version
> >>> could be enough, although the same can just as easily be encoded
> >>> as a fourcc (V4L2_META_FMT_OVXXXX_V1, _V2, etc). A fourcc is more
> >>> consistent with the current API.
> >>
> >> Right, our use-cases so far don't send a lot of data as per-frame
> >> metadata; no idea what others do.
> >
> > What kind of hardware do you deal with that sends metadata? And
> > over what kind of channel does it send it?
>
> A CSI-2 connected camera sensor.

Is the metadata sent as embedded data lines with a different CSI-2 DT ?
--
Regards,

Laurent Pinchart

--
To unsubscribe from this list: send the line "unsubscribe linux-media"
in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html