On Mon, Dec 02, 2024 at 09:05:07AM +0100, Hans Verkuil wrote: > On 02/12/2024 01:18, Laurent Pinchart wrote: > > On Fri, Nov 29, 2024 at 11:18:54PM +0100, Ricardo Ribalda wrote: > >> On Fri, 29 Nov 2024 at 23:03, Laurent Pinchart wrote: > >>> On Fri, Nov 29, 2024 at 07:47:31PM +0100, Ricardo Ribalda wrote: > >>>> Before we all go on a well deserved weekend, let me recap what we > >>>> know. If I did not get something correctly, let me know. > >>>> > >>>> 1) Well behaved devices do not allow to set or get an incomplete async > >>>> control. They will stall instead (ref: Figure 2-21 in UVC 1.5 ) > >>>> 2) Both Laurent and Ricardo consider that there is a big chance that > >>>> some camera modules do not implement this properly. (ref: years of > >>>> crying over broken module firmware :) ) > >>>> > >>>> 3) ctrl->handle is designed to point to the fh that originated the > >>>> control. So the logic can decide if the originator needs to be > >>>> notified or not. (ref: uvc_ctrl_send_event() ) > >>>> 4) Right now we replace the originator in ctrl->handle for unfinished > >>>> async controls. (ref: > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/media/usb/uvc/uvc_ctrl.c#n2050) > >>>> > >>>> My interpretation is that: > >>>> A) We need to change 4). We shall not change the originator of > >>>> unfinished ctrl->handle. > >>>> B) Well behaved cameras do not need the patch "Do not set an async > >>>> control owned by another fh" > >>>> C) For badly behaved cameras, it is fine if we slightly break the > >>>> v4l2-compliance in corner cases, if we do not break any internal data > >>>> structure. > >>> > >>> The fact that some devices may not implement the documented behaviour > >>> correctly may not be a problem. Well-behaved devices will stall, which > >>> means we shouldn't query the device while as async update is in > >>> progress. Badly-behaved devices, whatever they do when queried, should > >>> not cause any issue if we don't query them. > >> > >> I thought we could detect the stall and return safely. Isn't that the case? > > > > We could, but if we know the device will stall anyway, is there a reason > > not to avoid issuing the request in the first place ? > > > >> Why we have not seen issues with this? > > > > I haven't tested a PTZ device for a very long time, and you would need > > to hit a small time window to see the issue. > > > >>> We should not send GET_CUR and SET_CUR requests to the device while an > >>> async update is in progress, and use cached values instead. When we > >>> receive the async update event, we should clear the cache. This will be > >>> the same for both well-behaved and badly-behaved devices, so we can > >>> expose the same behaviour towards userspace. > >> > >> seting ctrl->loaded = 0 when we get an event sounds like a good idea > >> and something we can implement right away. > >> If I have to resend the set I will add it to the end. > >> > >>> We possibly also need some kind of timeout mechanism to cope with the > >>> async update event not being delivered by the device. > >> > >> This is the part that worries me the most: > >> - timeouts make the code fragile > >> - What is a good value for timeout? 1 second, 30, 300? I do not think > >> that we can find a value. > > > > I've been thinking about the implementation of uvc_fh cleanup over the > > weekend, and having a timeout would have the nice advantage that we > > could reference-count uvc_fh instead of implementing a cleanup that > > walks over all controls when closing a file handle. I think it would > > make the code simpler, and possibly safer too. > > > >>> Regarding the userspace behaviour during an auto-update, we have > >>> multiple options: > >>> > >>> For control get, > >>> > >>> - We can return -EBUSY > >>> - We can return the old value from the cache > > This would match the control behavior best. Only when the operation is > done is the control updated and the control event sent. > > Some questions: is any of this documented for UVC? Because this is non-standard No this isn't documented. > behavior. Are there applications that rely on this? Should we perhaps add I don't know. > proper support for this to the control framework? E.g. add an ASYNC flag and > document this? We could, but this is such a specific use case that I don't think is worth adding complexity to the already complex control framework would be worth it. What we could do is perhaps adding a flag for the userspace API, but even there, I never like modelling an API with a single user. > >>> - We can return the new value fromt he cache > >>> > >>> Returning -EBUSY would be simpler to implement. > >> > >> Not only easy, I think it is the most correct, > >> > >>> I don't think the behaviour should depend on whether the control is read > >>> on the file handle that initiated the async operation or on a different > >>> file handle. > >>> > >>> For control set, I don't think we can do much else than returning > >>> -EBUSY, regardless of which file handle the control is set on. > >> > >> ACK. > >> > >>>> I will send a new version with my interpretation. > >>>> > >>>> Thanks for a great discussion > >> > >> Looking with some perspective... I believe that we should look into > >> the "userspace behaviour for auto controls" in a different patchset. > >> It is slightly unrelated to this discussion. -- Regards, Laurent Pinchart