On Mon, Sep 04, 2017 at 11:41:05AM +0300, Laurent Pinchart wrote: > Hi Daniel, > > On Monday, 4 September 2017 10:26:15 EEST Daniel Vetter wrote: > > On Thu, Aug 31, 2017 at 09:22:03PM +0200, Noralf Trønnes wrote: > > > Den 31.08.2017 14.59, skrev Laurent Pinchart: > > >> On Wednesday, 30 August 2017 20:18:49 EEST Daniel Vetter wrote: > > >>> On Wed, Aug 30, 2017 at 6:31 PM, Noralf Trønnes wrote: > > >>>> Den 28.08.2017 23.56, skrev Daniel Vetter: > > >>>>> On Mon, Aug 28, 2017 at 07:17:48PM +0200, Noralf Trønnes wrote: > > >>>>>> + > > >>>>>> + drm_fbdev_cma_dev_unplug(tdev->fbdev_cma); > > >>>>>> + drm_dev_unplug(&tdev->drm); > > >>>>>> + > > >>>>>> + /* Make sure framebuffer flushing is done */ > > >>>>>> + mutex_lock(&tdev->dirty_lock); > > >>>>>> + mutex_unlock(&tdev->dirty_lock); > > >>>>> > > >>>>> Is this really needed? Or, doesn't it just paper over a driver bug > > >>>>> you have already anyway, since native kms userspace can directly > > >>>>> call fb->funcs->dirty too, and you already protect against that. > > >>>>> > > >>>>> This definitely looks like the fbdev helper is leaking > > >>>>> implementation details to callers where it shouldn't do that. > > >>>> > > >>>> Flushing can happen while drm_dev_unplug() is called, and when we > > >>>> leave this function the device facing resources controlled by devres > > >>>> will be removed. Thus I have to make sure any such flushing is done > > >>>> before leaving so the next flush is stopped by the > > >>>> drm_dev_is_unplugged() check. I don't see any other way of ensuring > > >>>> that. > > >>>> > > >>>> I see now that I should move the call to > > >>>> drm_atomic_helper_shutdown() after drm_dev_unplug() to properly > > >>>> protect the pipe .enable/.disable callbacks. > > >>> > > >>> Hm, calling _shutdown when the hw is gone already won't end well. > > >>> Fundamentally this race exists for all use-cases, and I'm somewhat > > >>> leaning towards plugging it in the core. > > >>> > > >>> The general solution probably involves something that smells a lot > > >>> like srcu, i.e. at every possible entry point into a drm driver > > >>> (ioctl, fbdev, dma-buf sharing, everything really) we take that > > >>> super-cheap read-side look, and drop it when we leave. > > >> > > >> That's similar to what we plan to do in V4L2. The idea is to set a > > >> device removed flag at the beginning of the .remove() handler and wait > > >> for all pending operations to complete. The core will reject any new > > >> operation when the flag is set. To wait for completion, every entry > > >> point would increase a use count, and decrease it on exit. When the use > > >> count is decreased to 0 waiters will be woken up. This should solve the > > >> unplug/user race. > > > > > > Ah, such a simple solution, easy to understand and difficult to get wrong! > > > And it's even nestable, no danger of deadlocking. > > > > > > Maybe I can use it with tinydrm: > > > > > > * @dev_use: Tracks use of functions acessing the parent device. > > > * If it is zero, the device is gone. See ... > > > struct tinydrm_device { > > > atomic_t dev_use; > > > }; > > > > > > /** > > > * tinydrm_dev_enter - Enter device accessing function > > > * @tdev: tinydrm device > > > * > > > * This function protects against using a device and it's resources after > > > * it's removed. Should be called at the beginning of the function. > > > * > > > * Returns: > > > * False if the device is still present, true if it is gone. > > > */ > > > static inline bool tinydrm_dev_enter(struct tinydrm_device *tdev) > > > { > > > return !atomic_inc_not_zero(&tdev->dev_use); > > > } > > > > > > static inline void tinydrm_dev_exit(struct tinydrm_device *tdev) > > > { > > > atomic_dec(&tdev->dev_use); > > > } > > > > > > static inline bool tinydrm_is_unplugged(struct tinydrm_device *tdev) > > > { > > > bool ret = !atomic_read(&tdev->dev_use); > > > smp_rmb(); > > > return ret; > > > } > > > > > > > > > static int tinydrm_init(...) > > > { > > > /* initialize */ > > > > > > /* Set device is present */ > > > atomic_set(&tdev->dev_use, 1); > > > } > > > > > > static void tinydrm_unregister(...) > > > { > > > /* Set device gone */ > > > atomic_dec(&tdev->dev_use); > > > > > > /* Wait for all device facing functions to finish */ > > > while (!tinydrm_is_unplugged(tdev)) { > > > cond_resched(); > > > } > > > > > > /* proceed with unregistering */ > > > } > > > > > > static int mipi_dbi_fb_dirty(...) > > > { > > > if (tinydrm_dev_enter(tdev)) > > > return -ENODEV; > > > > > > /* flush framebuffer */ > > > > > > tinydrm_dev_exit(tdev); > > > } > > > > Yup, expect imo this should be done in drm core (as much as possible, i.e. > > for ioctl at least, standard sysfs), plus then enter/exit exported to > > drivers for their stuff. > > > > And it really needs to be srcu. For kms drivers the atomic_inc/dec wont > > matter, for render drivers it will show up (some of our ioctls are 100% > > lock less in the fastpath, not a single atomic op in there). > > Don't forget that we need to check the disconnect flag when entering ioctls to > return -ENODEV. I'm open to clever solutions, but I'd first aim for > correctness with a lock and then replace the internal implementation. As Noralf pointed out, we already check for drm_dev_is_unplugged(). Maybe not in all places, but that can be fixed. And you can't first make all ioctl slower and then fix it up, at least not spread over multiple patch series. I guess for developing, doing the simpler atomic counter first is ok. But the same patch series needs to move over to srcu at the end. But I'm not sure that's a good idea, since implementing this 100% correctly using your atomic_t idea means you implement half of srcu anyway. Otoh discovering all those races should be an interesting journey :-) -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel