Re: [PATCH 0/2] V4L: Extended crop/compose API

Tomasz Stanislawski <t.stanislaws@xxxxxxxxxxx> · Wed, 18 May 2011 18:55:51 +0200

Laurent Pinchart wrote:
On Saturday 14 May 2011 12:50:32 Hans Verkuil wrote:
On Friday, May 13, 2011 14:43:08 Laurent Pinchart wrote:
On Saturday 07 May 2011 13:52:25 Hans Verkuil wrote:
On Thursday, May 05, 2011 11:39:54 Tomasz Stanislawski wrote:
Hi Laurent and Hans,
I am very sorry for replying so lately. I was really busy during last 
week. Thank you very much for your interest and comments :).

[snip]

>> -----------------------------------------------------------------
>>>>>
>>>>> Name
>>>>>
>>>>> 	VIDIOC_S_SELECTION - set cropping rectangle on an input of a device

[snip]

>>>>> v4l2-selection::r is filled with suggested coordinates. The 
coordinates
>>>>> are expressed in driver-dependant units.
>
> What kind of driver-dependant units do you think will be used in 
practice ?

In a case of sensors, units could be subpixels. For analog TV inputs, 
units may be milliseconds. For printers/scanner, units may be centimeters.
I think that there are many other examples out there.

Applications set V4L2_SEL_TRY flag in v4l2_selection::flags to
prevent a driver from applying selection configuration to hardware.
I mentioned this before but I am very much opposed to this flag. It is
inconsistent with the TRY_FMT and TRY_EXT_CTRLS ioctls. Note that in
video_ioctl2 it should be just one function with 'try' boolean
argument. It has always been a mistake that the try and set functions
were separated in the driver code IMHO.

I know that the subdev ioctls do not have a try version, but it is not
quite the same in that those try functions actually store the try
information.
That's exactly why the subdevs pad-level API has a try flag instead of a
try operation, and that's why g/s_selection on subdevs will be done with
a try flag.

As for the video device node API, I won't oppose a TRY ioctl, as long as
we can guarantee there will never be dependencies between the selection
rectangles and other parameters (or between the different selection
rectangles). If the crop rectangle depends on the compose rectangle for
instance, how can you implement a TRY ioctl to try a crop rectangle for
a specific compose rectangle, without modifying the active compose
rectangle first ?
In that case the TRY would adjust the crop so that it works with the
current compose rectangle.

And how do you try both crop and compose settings without modifying the active 
configuration ? That's not possible, and I think it's a bad API limitation.

VIDIOC_TRY_MULTISELECTION ?

But this is just one more example of our lack of proper support for pipeline
setup. It doesn't really matter whether this is at the subdev level or at
the driver level, both have the same problems.

This requires a brainstorm session to work out.

This is something we need to look into more carefully. I am slowly
becoming convinced that we need some sort of transaction-based
configuration for pipelines.
This RFC is about video device node configuration, not pipelines. For
pipelines I think we'll need a transaction-based API. For video device
nodes, I'm still unsure. As stated above, if we have multiple parameters
that depend on each other, how do we let the user try them without
changing the active configuration ?
Cropping/scaling/composing IS a pipeline. Until recently the V4L2 device
node API was sufficient to setup the trivial pipelines that most V4L2
consumer devices have. But with the more modern devices it starts to show
its limitations.

It's still a simple pipeline, and I think we should aim at making the V4L2 
device node API useful to configure this kind of pipeline. The selection API 
is a superset of the crop API, applications should be able to use it to 
replace the crop API in the long term.

The whole 'try' concept we had for a long time needs to be re-examined.

I agree.

As you remember, I was never satisfied with the subdev 'try' approach
either, but I never could come up with a better alternative.

I've noticed that there are two different meaning of TRY flag
a) checking if a proposed configuration is applicable for a driver
b) storing proposed configuration in some form of temporary buffer

Ad. a. This is a real TRY command. The state of both hardware and driver 
does not change after TRY call.

Ad. b. It should be called not TRY flag because the internal state of a 
driver changes. It should be named as something like SHADOW flag. It 
would indicate that the change is applied only to shadow configuration.

I propose to change name of TRY flag for subdev to SHADOW flag. I think 
it would much clear to express the difference of TRY meaning in video 
node and subdev contexts.

Therefore ioctl VIDIOC_TRY_SELECTION is probably better and more 
convenient way of testing if configuration is applicable.

Regardless of how such a scheme should work, one thing that I believe
is missing in the format ioctls (both on the video and subdev level)
is a similar concept like the flags in this proposal. It would be
quite useful if you could indicate that the desired WxH has to be
exact or can be larger or smaller. It would certainly simplify the
interaction between the selection and scaling.

On success field v4l2_selection::r is filled with adjusted rectangle.

Return value

	On success 0 is returned, on error -1 and the errno variable is set

        appropriately:
	EINVAL - incorrect buffer type, incorrect/not supported target

4. Flags
4.1. Hints

The field v4l2_selection::flags is used to give a driver a hint about
coordinate adjustments. The set of possible hint flags was reduced to
two entities for practical reasons. The flag V4L2_SEL_SIZE_LE is a
suggestion for the driver to decrease or keep size of a rectangle. 
The flags V4L2_SEL_SIZE_GE imply keeping or increasing a size of a
rectangle.

By default, lack of any hint flags indicate that driver has to choose
selection coordinates as close as possible to the ones passed in
v4l2_selection::r field.

Setting both flags implies that the driver is free to adjust the
rectangle.  It may return even random one however much more
encouraged behaviour would be adjusting coordinates in such a way
that other stream parameters are left intact. This means that the
driver should avoid changing a format of an image buffer and/or any
other controls.
This makes no sense to me. It sounds like this is what flags == 0
should do.

I would expect that if both flags are specified that that would equal
SEL_SIZE_EQ. I.E., the rectangle has to be exact size-wise, and should
be as close as possible to the original position.
What happens if that's not possible ? The ioctl should never return an
error,
Why not? If I tell the driver that I want exactly WxH, then I see no reason
why it can't return an error. An application might use that result to
switch to a different resolution, for example. E.g., the application tries
640x480 first, that fails, then it tries 320x240 (or whatever).

To make the API more consistent. Applications ask drivers for specific 
settings (including optional hints), and drivers return what they've been able 
to configure. It's then the application's responsibility to check the return 
values and act upon it. Drivers shouldn't return an error when setting 
formats/selections, except if the given arguments can't be understood.
Hmm.. I see two solutions:

Solution I (more restrictive):
0 - driver is free to adjust size, it is recommended to choose the 
crop/compose rectangle as close as possible to desired one

SEL_SIZE_GE - drive is not allowed to shrink the rectangle. If no such a 
rectangle exists ERANGE is returned (EINVAL is used for 
not-understandable configuration)

SEL_SIZE_LE - drive is not allowed to grow the rectangle. If no such a 
rectangle exists ERANGE is returned (EINVAL is used for 
not-understandable configuration)

SEL_SIZE_EQ = SEL_SIZE_GE | SEL_SIZE_LE - choose size exactly the same 
as in desired rectangle. Return ERANGE if such a configuration is not 
possible.

-----------------------------------------

Solution II (less restrictive). Proposed in this RFC.

0 - apply rectangle as close as possible to desired one like the default 
behavior of  VIDIOC_S_CROP.

SEL_SIZE_GE - suggestion to increase or keep size of both coordinates

SEL_SIZE_LE - suggestion to decrease or keep size of both coordinates

SEL_SIZE_GE | SEL_SIZE_LE - technically suggestion to "increase or keep 
or decrease" sizes. Basically, it means that driver is completely free 
to choose coordinates. It works like saying "give me a crop similar to 
this one" to the driver. I agree that it is not "a very useful" 
combination of flags.

In both solutions, the driver is recommended to keep the center of the 
rectangle in the same place.

Personally, I prefer 'solution I' because it is more logical one.
One day, the SEL_SIZE_GE could be expanded to LEFT_LE | RIGHT_GE | 
TOP_LE | BOTTOM_GE flags if drivers could support it.

[snip]

>>> One thing I think would be helpful though, is if the target name 
would tell
>>> you whether it is a read-only or a read-write target. It might also be
>>> helpful if the IDs of the read-only targets would set some 
read-only bit.
>>> That would make it easy for video_ioctl2 to test for invalid 
targets for
>>> S_SELECTION.
>>>
>>> Not sure about the naming though:
>>>
>>> V4L2_SEL_RO_CROP_DEFAULT
>>> V4L2_SEL_CROP_RO_DEFAULT
>>> V4L2_SEL_CROP_DEFAULT_RO
>>>
>>> None looks right. A read-only bit might be sufficient as it would 
clearly
>>> indicate in the header that that target is a read-only target.
>> What if some future hardware have setable default or bounds 
rectangles ? I
>> don't know what that would be used for, it's just in case :-)
>
> If it is settable, then it is no longer a default or bounds rectangle but
> some other rectangle :-)
>

I agree with Laurent. Both bounds and defrect are not exactly read-only.
They could be modified by S_FMT, S_STD/S_DV_PRESET ioctl.
The DEFRECT can change after switching aspect ratio on TV output.

If cropping is not supported setting ACTIVE target returns EINVAL, 
because this target is read-only in this context.
Change of ACTIVE target might be also impossible because hardware is in 
streaming state.

Basically, ant target could be Read-only because of some reason.

Therefore I see no reason to break symmetry between 
active/bound/defrect/? by adding RO {pre/suff}ix.

[snip]

All cropcap fields except pixel aspect are supported in new API. I
noticed that there was discussion about pixel aspect and I am not
convinced that it should be a part of the cropping API. Please refer
to the post:
http://lists-archives.org/video4linux/22837-need-vidioc_cropcap-clari
fica tion.html
Yeah, what are we going to do about that? I agree that it does not
belong here. But we can't sweep it under the carpet either.

The pixel aspect ratio is typically a property of the current input or
output video format (either a DV preset or a PAL/NTSC STD). For DV
presets we could add it to struct v4l2_dv_enum_presets and we could do
the same for STD formats by adding it to struct v4l2_standard.
Cropping and composing doesn't modify the pixel aspect ratio, so I agree
it doesn't belong there.

This would fail for sensors, though, since there the chosen sensor
framesize is set through S_FMT. (This never quite made sense to me,
though, from a V4L2 API perspective). I'm not sure whether we can
always assume 1:1 pixel ratio for sensors. Does anyone know?
Most of the time I suppose so, but I wouldn't be surprise if some exotic
sensors had non-square pixel aspect ratios.
Would it make sense to add the pixel ratio to struct v4l2_frmsizeenum?

And what about adding a VIDIOC_G/S_FRAMESIZE to select a sensor resolution?

This would typically be one of the discrete framesizes supported by a
sensor through binning/skipping. If there is also a scaler on the sensor,
then that is controlled through S_FMT. For video it is S_FMT that controls
the scaler (together with S_CROP at the moment), but the source resolution
is set through S_STD/S_DV_PRESET/S_DV_TIMINGS. It always felt very
inconsistent to me that there is no equivalent for sensors, even though
you can enumerate all the available framesizes (just as you can with
ENUMSTD and ENUM_DV_PRESETS).

The problem is that S_FMT is used to configure two entities:
- memory buffer
- sensor
The selection API help to configure crop/compose/scaling. Therefore if 
selection would be accepted in V4L, then it may be worth to consider 
separation of memory and sensors configuration.

I agree that sensors need a dedicated ioctl of {G,S,TRY,ENUM}_SENSOR 
family. I do not feel competent in this matter but I think that the 
ioctl should support feature typical for sensors like:
- array size (before scaling)
- array shape
- (sub)pixel ratio or equivalent
- binning/skipping

The V4L is designed for simple pipelines like one below:

Input ---->  Processing ----> Output

In sensor case we have:

Sensor array  ----->  Processing --------> Memory
- resolution          - compose            - format
- binning             - crop               - resolution bounds
- skipping            - scaling            - size in bytes (!)
                                           - alignment/bytesperline

For TV output the pipeline is following:

Memory       ------> Processing --------> TV output
- format             - compose            - tv standard
- bounds             - crop               - timing
- size               - scaling            - DAC encoding
- alignment

I think that memory(buffers) should be configured using S_FMT.
Sensors should be configured with new S_SENSOR ioctl.

I do not see how this approach could break backward compatibility?
The only problem might be that the sensor may not work optimally is its 
default resolution is too big in comparison to buffer resolution.
Choosing optimal configuration of sensor/scaler is done in MC.
Applying overmentioned approach, it could be also configured in video node.

What is your opinion about this idea?

Let's take one step back here.

We started with the V4L2 device node API to control (more or less) simple 
devices. Device became more complex, and we created a new MC API (along with 
the subdev pad-level API) to configure complex pipelines. The V4L2 device node 
API still lives, and we want to enhance it to configure medium complexity 
devices.

Before going much further, I think we need to define what a medium complexity 
device is and where we put the boundary between devices that can be configured 
with the V4L2 device node API, and devices that require the MC API.

I believe this shouldn't be too difficult. What we need to do is create a 
simple virtual pipeline that supports cropping, scaling and composing, and map 
the V4L2 device node API to that pipeline configuration. Devices that map to 
that pipeline could then use the V4L2 device node API only, with clearly 
defined semantics.

[snip]

  * resolution of an image combined with support for
  VIDIOC_S_MULTISELECTION

    allows to pass a triple format/crop/compose sizes in a single
    ioctl
I don't believe S_MULTISELECTION will solve anything. Specific
use-cases perhaps, but not the general problem of setting up a
pipeline. I feel another brainstorm session coming to solve that. We
never came to a solution for it in Warsaw.
Pipelines are configured on subdev nodes, not on video nodes, so I'm also
unsure whether multiselection support would really be useful.

Passing compose and crop rectangle in single ioctl might help in some 
cases, like the one described here:
http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/27581

If we decide to implement multiselection support, we might as well use
that only. We would need a multiselection target bitmask, so selection
targets should all be < 32.
I was considering something like this:

struct v4l2_multiselection {
	int n; /* number of targets */
	struct v4l2_selection s[0]; /* 0-length array */
};

There is no need for a bitmask.

Thinking some more about it, does it make sense to set both crop and
compose on a single video device node (not talking about mem-to-mem,
where you use the type to multiplex input/output devices on the same
node) ? If so, what would the use cases be ?

Configuration of multi input/output devices could be realised by adding 
extra targets and passing them all using MULTISELECTION.

To sum up, the multiselection is only an brainstorm idea. I think that 
transaction-based API is simpler and more robust.
Multiselection would be realized by passing multiple VIDIOC_S_SELECTION
inside single transaction window.

Best regards,
Tomasz Stanislawski

Should all devices support the selection API, or only the simple ones
that don't expose a user-configurable pipeline to userspace through the
MC API ? The proposed API looks good to me, but before acking it I'd
like to clarify how (if at all) this will interact with subdev pad-level
configuration on devices that support that. My current feeling is that
video device nodes for such devices should only be used for video
streaming. All parameters should be configured directly on the subdevs.
Application might still need to call S_FMT in order to be able to
allocate buffers though.
This comes back to how we want to implement backwards compatibility for
existing applications. There must be a way for 'standard' apps to work
with complex drivers for specific video nodes (the mc would probably mark
those as a 'DEFAULT' node).

I'd say that there are roughly two options: either implement the selection
etc. APIs for those video nodes only in the driver, letting the driver set
up the subdev pipeline, or do it via libv4l SoC-specific plugins.

In my opinion we need to finish the pipeline configuration topic we started
in Warsaw before we can finalize this RFC. This RFC clearly demonstrates
that we have inconsistencies and deficiencies in our API that need to be
solved first. When we have done that, then I expect that this selection
API will be easy to finalize.

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html