Re: [RFC 00/19] Async sub-notifiers and how to use them

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Niklas,

On Wednesday, 23 August 2017 15:59:10 EEST Niklas Söderlund wrote:
> On 2017-08-23 11:09:15 +0200, Hans Verkuil wrote:
> > On 08/04/17 20:25, Sakari Ailus wrote:
> >> Niklas Söderlund wrote:
> >>> On 2017-07-20 19:14:01 +0300, Sakari Ailus wrote:
> >>>> On Wed, Jul 19, 2017 at 01:42:55PM +0200, Hans Verkuil wrote:
> >>>>> On 18/07/17 21:03, Sakari Ailus wrote:
> >>>>>> Hi folks,
> >>>>>> 
> >>>>>> This RFC patchset achieves a number of things which I've put to the
> >>>>>> same patchset for they need to be show together to demonstrate the
> >>>>>> use cases.
> >>>>>> 
> >>>>>> I don't really intend this to compete with Niklas's patchset but
> >>>>>> much of the problem area addressed by the two is the same.
> >>>>>> 
> >>>>>> Comments would be welcome.
> >>>>>> 
> >>>>>> - Add AS3645A LED flash class driver.
> >>>>>> 
> >>>>>> - Add async notifiers (by Niklas).
> >>>>>> 
> >>>>>> - V4L2 sub-device node registration is moved to take place at the
> >>>>>>   same time with the registration of the sub-device itself. With
> >>>>>>   this change, sub-device node registration behaviour is aligned
> >>>>>>   with video node registration.
> >>>>>> 
> >>>>>> - The former is made possible by moving the bound() callback after
> >>>>>>   sub-device registration.
> >>>>>> 
> >>>>>> - As all the device node registration and link creation is done as
> >>>>>>   the respective devices are probed, there is no longer dependency
> >>>>>>   to the notifier complete callback which as itself is seen
> >>>>>>   problematic. The complete callback still exists but there's no
> >>>>>>   need to use it, pending changes in individual drivers.
> >>>>>>   
> >>>>>>   See:
> >>>>>>   <URL:http://www.spinics.net/lists/linux-media/msg118323.html>
> >>>>>>   
> >>>>>>   As a result, if a part of the media device fails to initialise
> >>>>>>   because it is e.g. physically broken, it will be possible to use
> >>>>>>   what works.
> >>>>> 
> >>>>> I've got major problems with this from a userspace point of view. In
> >>>>> the vast majority of cases you just want to bail out if one or more
> >>>>> subdevs fail.
> >>>>
> >>>> I admit it's easier for the user space if the device becomes available
> >>>> only when all its component drivers have registered.
> >>>> 
> >>>> Also remember that video nodes are registered in the file system right
> >>>> on device probe time. It's only sub-device and media device node
> >>>> registration that has taken place in the notifier's complete handler.
> >>> 
> >>> Is this always the case? In the R-Car VIN driver I register the video
> >>> devices using video_register_device() in the complete handler. Am I
> >>> doing things wrong in that driver? I had a patch where I moved the
> >>> video_register_device() call to probe time but it got shoot down in
> >>> review and was dropped.
> >> 
> >> I don't think the current implementation is wrong, it's just different
> >> from other drivers; there's really no requirement regarding this AFAIU.
> >> It's one of the things where no attention has been paid I presume.
> > 
> > It actually is a requirement: when a device node appears applications can
> > reasonably expect to have a fully functioning device. True for any device
> > node. You don't want to have to wait until some unspecified time before
> > the full functionality is there.
> > 
> > I try to pay attention to this when reviewing code, since not following
> > this rule basically introduces a race condition which is hard to test.
> 
> In the latest version of the R-Car VIN series I have now moved the video
> device registration to happen at probe time... So I think it would be a
> good time to clarify what and what is not the intended way of where this
> can happen. I'm not keen on reworking that series for each time it's
> posted to where the video device is registered :-) I can see both good
> and bad things with both solutions.

I agree, let's get to the bottom of this issue.

> If the video device is registered in the complete callback it do make it
> easier to spot some race conditions (my VIN series got review comments
> where I missed this almost instantly).

I'm not sure I agree with that. It removes several potential race conditions, 
but I don't see how it makes it easier to spot any of them.

> It is also clear to user-space when a device is ready to be used. At the
> same time as Sakari points out this prevents partially complete graphs which
> might contain a valid pipeline to be able to function, which of these two
> behaviors is the most opportune I assume differs with use-cases and which
> one is best from a framework point of view I don't know.

For drivers that support pipelines with multiple sources I don't think there's 
any disagreement. We need to support partial pipelines, and thus need to 
register video nodes at probe time. The case that we're debating here is vdev-
centric drivers that expose a single video node and no subdev-node, and have a 
single source. Is there a general agreement on this ?

> But I do know that if a video device is registered from the complete
> callback it's reasonable that it should be unregistered if the unbind
> callback is called, right?

I don't see any other way it could work in that case, do I'd say yes.

> Else the same situation as registering it at probe time is reached if a
> subdevice is ever unbound and the driver needs to handle the corner cases of
> both situations. And this use-case is today broken in v4l2! If a video
> device is registered in the complete callback, unregistered in the unbind
> callback and later re-registered in the complete callback once the subdevice
> is re-bound everything blows up with
> 
>   kobject (eb3be918): tried to init an initialized object, something is
> seriously wrong.

That's because reusing a struct device is wrong. In this case the video_device 
structure needs to be allocated dynamically at .complete() time, and 
unregistered at .unbind() time. It will then be freed from its .release() 
handler when the last reference will disappear, which could be way after the 
subdev is unbound or even rebound. In the latter case two struct video_device 
will coexist for some time.

> But yes if the video device is registered at probe time there are more
> races and object life-time issues for the driver to handle, but these
> needs to be considered anyhow if the unbind/re-bind scenario is to be
> fixed, right?

I believe so.

> So maybe it don't really matter where the video device is registered and
> both methods should be allowed and documented (so all drivers returns same
> -ENOSUBDEVBOUNDYET etc) and leave it up to each driver to handle this for
> how it perceives it primary use-case to be?

I have no strong opinion on whether the decision should be left to driver 
authors, but I believe we should have at least the option of registering video 
device nodes a probe time.

> And instead we should talk about how to fix the bind/unbind issues as this
> is where IMHO where the real problem is.

That I agree with. There's so many object lifetime management issues in V4L2 
that it's hard not to run into one of them during development (as proved again 
but this mail thread :-)). For me this is the number one priority. I'm working 
on it with Sakari this afternoon.

> >> However doing anything that can fail earlier on would be nicer since
> >> there's no reasonable way to signal an error from complete callback
> >> either.
> > 
> > Right.
> > 
> > Adding support for cases where devices may not be present is very
> > desirable, but this should go through an RFC process first to hammer out
> > all the details.
> > 
> > Today we do not support this and we have to review code with that in mind.
> > 
> > So the first async subnotifiers implementation should NOT support this
> > (although it can of course be designed with this in mind). Once it is in
> > we can start on an RFC on how to support partial pipelines. I have a lot
> > of questions about that that need to be answered first.
> > 
> > One thing at a time. Trying to do everything at once never works.

-- 
Regards,

Laurent Pinchart





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux