Re: [RFC v3 00/21] Make use of kref in media device, grab references as needed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

On Tuesday 13 Dec 2016 15:23:53 Shuah Khan wrote:
> On 12/13/2016 05:24 AM, Mauro Carvalho Chehab wrote:
> > Em Tue, 13 Dec 2016 12:53:05 +0200 Sakari Ailus escreveu:
> >> On Tue, Nov 29, 2016 at 09:13:05AM -0200, Mauro Carvalho Chehab wrote:
> >>> Hi Sakari,
> >>> 
> >>> I answered you point to point below, but I suspect that you missed how
> >>> the current approach works. So, I decided to write a quick summary here.
> >>> 
> >>> The character devices /dev/media? are created via cdev, with relies on a
> >>> kobject per device, with has an embedded struct kref inside.
> >>> 
> >>> Also, each kobj at /dev/media0, /dev/media1, ... is associated with a
> >>> 
> >>> struct device, when the code does:
> >>> 	devnode->cdev.kobj.parent = &devnode->dev.kobj;
> >>> 
> >>> before calling cdev_add().
> >>> 
> >>> The current lifetime management is actually based on cdev's kobject's
> >>> refcount, provided by its embedded kref.
> >>> 
> >>> The kref warrants that any data associated with /dev/media0 won't be
> >>> freed if there are any pending system call. In other words, when
> >>> cdev_del() is called, it will remove /dev/media0 from the filesystem,
> >>> and
> >>> will call kobject_put().
> >>> 
> >>> If the refcount is zero, it will call devnode->dev.release(). If the
> >>> kobject refcount is not zero, the data won't be freed.
> >>> 
> >>> So, in the best case scenario, there's no opened file descriptors
> >>> by the time media device node is unregistered. So, it will free
> >>> everything.
> >>> 
> >>> In the worse case scenario, e. g. when the driver is removed or
> >>> unbind while /dev/media0 has some opened file descriptor(s),
> >>> the cdev logic will do the proper lifetime management.
> >>> 
> >>> On such case, /dev/media0 disappears from the file system, so another
> >>> open
> >>> is not possible anymore. The data structures will remain allocated until
> >>> all associated file descriptors are not closed.
> >>> 
> >>> When all file descriptors are closed, the data will be freed.
> >>> 
> >>> On that time, it will call an optional dev.release() callback,
> >>> responsible to free any other data struct that the driver allocated.
> >> 
> >> The patchset does not change this. It's not a question of the
> >> media_devnode struct either. That's not an issue.
> >> 
> >> The issue is rather what else can be accessed through the media device
> >> and other interfaces. As IOCTLs are not serialised with device removal
> >> (which now releases much of the data structures)
> > 
> > Huh? ioctls are serialized with struct device removal. The Driver core
> > warrants that.

Code references please.
 
> >> there's a high chance of accessing
> >> released memory (or mutexes that have been already destroyed). An example
> >> of that is here, stopping a running pipeline after unbinding the device.
> >> What happens there is that the media device is released whilst it's in
> >> use through the video device.
> >> 
> >> <URL:http://www.retiisi.org.uk/v4l2/tmp/media-ref-dmesg2.txt>
> > 
> > It is not clear from the logs what the driver tried to do, but
> > that sounds like a driver's bug, with was not prepared to properly
> > handle unbinds.
> > 
> > The problem here is that isp_video_release() is called by V4L2
> > release logic, and not by the MC one:
> > 
> > static const struct v4l2_file_operations isp_video_fops = {
> > 	.owner		= THIS_MODULE,
> > 	.open		= isp_video_open,
> > 	.release	= isp_video_release,
> > 	.poll		= vb2_fop_poll,
> > 	.unlocked_ioctl	= video_ioctl2,
> > 	.mmap		= vb2_fop_mmap,
> > };
> > 
> > It seems that the driver's logic allows it to be called before or
> > after destroying the MC.
> 
> Right isp_video_release() will definitely be called after driver is
> gone which means media device is gone and the device itself.

Certainly not after the driver is gone (as in the module being unloaded from 
memory), but it can be called after the device is unbound from the driver, 
yes.

> Both au0828 and em28xx have these release handlers. Neither one uses
> devm resource for their device structs.

And no driver exposing objects to userspace-accessible code paths should. I've 
been pointing at how devm_kzalloc() is abused for more than a year now, it's 
nice to see that people slowly start listening.

> Also, both em28xx and au0828 keep disconnected state and have logic
> to detect the state of the driver and device. em28xx holds reference
> to v4l2->ref

That's very, very wrong. The v4l2_device::ref field must *not* be touched by 
drivers. Acquiring and releasing references to v4l2_device instances must be 
done with v4l2_device_get() and v4l2_device_put(), and the structure has a 
release handler that drivers can use. Why do people write such horrible code 
that pokes at private fields ?

> and releases the reference in em28xx_v4l2_close() which is
> its v4l2_file_operations .release handler. It also makes sure to not
> touch device hardware if device is disconnected.
> 
> Also, media graph access is done only when it has a valid media_device.
> au0828 allocates media_device struct and it gets free'd when it does
> its unregister sequence. Subsequent calls will check if it is null.

This is very wrong too. Don't try to handle data structures being pulled from 
under the driver's feet at random times. At best you will end up with races. 
Data structures must instead be properly refcounted.

> It also does checks to see if media_device is registered or not in
> some cases.
> 
> isp_video_release() isn't safe to be called after isp device is gone,
> leave alone media_device. Since isp is a devm resource, it is long
> gone when device_release() release managed resources.
> 
> I agree with Mauro that this is a driver problem.

No. There *is* a driver problem caused by devm_kzalloc() usage, and that 
problem *must* be fixed, but the media device life time management is also 
completely broken in core code.

> Mauro and I did lot of work to get the USB drivers (em28xx and au0828) to
> handle disconnect and unbind cases even before the media controller support
> was added to them.
> 
> I think what needs to happen is:
> 
> 1. Remove devm use from omap3

Absolutely.

> 2. Make sure media graph isn't accessed after media_device is unregistered

No way. We need to access the graph from the release handlers of the 
userspace-exposed structures (videodev and possibly media_device). The 
media_device structure must *not* disappear at unregistration time.

> 3. Take reference to v4l2 device to be able to make sanity checks from
>    isp_video_release() to determine if media_device is still around and
>    then do stop stream etc. It has to keep state.
> 
> I agree with Mauro that this is a driver problem. Mauro and I did lot
> of work to get the USB drivers (em28xx and au0828) to handle disconnect
> and unbind cases even before the media controller support was added to
> them.
> 
> Please don't pursue this RFC series that makes mc-core changes until
> ompa3 driver problems are addressed. There is no need to change the
> core unless it is necessary.

It is necessary as has been explained countless times, and will become more 
and more necessary as media_device instances get shared between multiple 
drivers, which is currently attempted *without* reference counting.

> I would be happy to help, unfortunately I don't have a omap3 device
> to fix and test problems. I am unable to find any omap3 devices. The
> one I have isn't good.

-- 
Regards,

Laurent Pinchart

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Input]     [Video for Linux]     [Gstreamer Embedded]     [Mplayer Users]     [Linux USB Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Yosemite Backpacking]
  Powered by Linux