On Mon, Sep 4, 2017 at 5:54 PM, Noralf Trønnes <noralf@xxxxxxxxxxx> wrote: > > Den 04.09.2017 17.20, skrev Daniel Vetter: >> >> On Mon, Sep 04, 2017 at 02:30:05PM +0200, Noralf Trønnes wrote: >>> >>> Den 04.09.2017 11.04, skrev Daniel Vetter: >>>> >>>> On Mon, Sep 04, 2017 at 11:41:05AM +0300, Laurent Pinchart wrote: >>>>> >>>>> Hi Daniel, >>>>> >>>>> On Monday, 4 September 2017 10:26:15 EEST Daniel Vetter wrote: >>>>>> >>>>>> On Thu, Aug 31, 2017 at 09:22:03PM +0200, Noralf Trønnes wrote: >>>>>>> >>>>>>> Den 31.08.2017 14.59, skrev Laurent Pinchart: >>>>>>>> >>>>>>>> On Wednesday, 30 August 2017 20:18:49 EEST Daniel Vetter wrote: >>>>>>>>> >>>>>>>>> On Wed, Aug 30, 2017 at 6:31 PM, Noralf Trønnes wrote: >>>>>>>>>> >>>>>>>>>> Den 28.08.2017 23.56, skrev Daniel Vetter: >>>>>>>>>>> >>>>>>>>>>> On Mon, Aug 28, 2017 at 07:17:48PM +0200, Noralf Trønnes wrote: >>>>>>>>>>>> >>>>>>>>>>>> + >>>>>>>>>>>> + drm_fbdev_cma_dev_unplug(tdev->fbdev_cma); >>>>>>>>>>>> + drm_dev_unplug(&tdev->drm); >>>>>>>>>>>> + >>>>>>>>>>>> + /* Make sure framebuffer flushing is done */ >>>>>>>>>>>> + mutex_lock(&tdev->dirty_lock); >>>>>>>>>>>> + mutex_unlock(&tdev->dirty_lock); >>>>>>>>>>> >>>>>>>>>>> Is this really needed? Or, doesn't it just paper over a driver >>>>>>>>>>> bug >>>>>>>>>>> you have already anyway, since native kms userspace can directly >>>>>>>>>>> call fb->funcs->dirty too, and you already protect against that. >>>>>>>>>>> >>>>>>>>>>> This definitely looks like the fbdev helper is leaking >>>>>>>>>>> implementation details to callers where it shouldn't do that. >>>>>>>>>> >>>>>>>>>> Flushing can happen while drm_dev_unplug() is called, and when we >>>>>>>>>> leave this function the device facing resources controlled by >>>>>>>>>> devres >>>>>>>>>> will be removed. Thus I have to make sure any such flushing is >>>>>>>>>> done >>>>>>>>>> before leaving so the next flush is stopped by the >>>>>>>>>> drm_dev_is_unplugged() check. I don't see any other way of >>>>>>>>>> ensuring >>>>>>>>>> that. >>>>>>>>>> >>>>>>>>>> I see now that I should move the call to >>>>>>>>>> drm_atomic_helper_shutdown() after drm_dev_unplug() to properly >>>>>>>>>> protect the pipe .enable/.disable callbacks. >>>>>>>>> >>>>>>>>> Hm, calling _shutdown when the hw is gone already won't end well. >>>>>>>>> Fundamentally this race exists for all use-cases, and I'm somewhat >>>>>>>>> leaning towards plugging it in the core. >>>>>>>>> >>>>>>>>> The general solution probably involves something that smells a lot >>>>>>>>> like srcu, i.e. at every possible entry point into a drm driver >>>>>>>>> (ioctl, fbdev, dma-buf sharing, everything really) we take that >>>>>>>>> super-cheap read-side look, and drop it when we leave. >>>>>>>> >>>>>>>> That's similar to what we plan to do in V4L2. The idea is to set a >>>>>>>> device removed flag at the beginning of the .remove() handler and >>>>>>>> wait >>>>>>>> for all pending operations to complete. The core will reject any new >>>>>>>> operation when the flag is set. To wait for completion, every entry >>>>>>>> point would increase a use count, and decrease it on exit. When the >>>>>>>> use >>>>>>>> count is decreased to 0 waiters will be woken up. This should solve >>>>>>>> the >>>>>>>> unplug/user race. >>>>>>> >>>>>>> Ah, such a simple solution, easy to understand and difficult to get >>>>>>> wrong! >>>>>>> And it's even nestable, no danger of deadlocking. >>>>>>> >>>>>>> Maybe I can use it with tinydrm: >>>>>>> >>>>>>> * @dev_use: Tracks use of functions acessing the parent device. >>>>>>> * If it is zero, the device is gone. See ... >>>>>>> struct tinydrm_device { >>>>>>> atomic_t dev_use; >>>>>>> }; >>>>>>> >>>>>>> /** >>>>>>> * tinydrm_dev_enter - Enter device accessing function >>>>>>> * @tdev: tinydrm device >>>>>>> * >>>>>>> * This function protects against using a device and it's resources >>>>>>> after >>>>>>> * it's removed. Should be called at the beginning of the function. >>>>>>> * >>>>>>> * Returns: >>>>>>> * False if the device is still present, true if it is gone. >>>>>>> */ >>>>>>> static inline bool tinydrm_dev_enter(struct tinydrm_device *tdev) >>>>>>> { >>>>>>> return !atomic_inc_not_zero(&tdev->dev_use); >>>>>>> } >>>>>>> >>>>>>> static inline void tinydrm_dev_exit(struct tinydrm_device *tdev) >>>>>>> { >>>>>>> atomic_dec(&tdev->dev_use); >>>>>>> } >>>>>>> >>>>>>> static inline bool tinydrm_is_unplugged(struct tinydrm_device *tdev) >>>>>>> { >>>>>>> bool ret = !atomic_read(&tdev->dev_use); >>>>>>> smp_rmb(); >>>>>>> return ret; >>>>>>> } >>>>>>> >>>>>>> >>>>>>> static int tinydrm_init(...) >>>>>>> { >>>>>>> /* initialize */ >>>>>>> >>>>>>> /* Set device is present */ >>>>>>> atomic_set(&tdev->dev_use, 1); >>>>>>> } >>>>>>> >>>>>>> static void tinydrm_unregister(...) >>>>>>> { >>>>>>> /* Set device gone */ >>>>>>> atomic_dec(&tdev->dev_use); >>>>>>> >>>>>>> /* Wait for all device facing functions to finish */ >>>>>>> while (!tinydrm_is_unplugged(tdev)) { >>>>>>> cond_resched(); >>>>>>> } >>>>>>> >>>>>>> /* proceed with unregistering */ >>>>>>> } >>>>>>> >>>>>>> static int mipi_dbi_fb_dirty(...) >>>>>>> { >>>>>>> if (tinydrm_dev_enter(tdev)) >>>>>>> return -ENODEV; >>>>>>> >>>>>>> /* flush framebuffer */ >>>>>>> >>>>>>> tinydrm_dev_exit(tdev); >>>>>>> } >>>>>> >>>>>> Yup, expect imo this should be done in drm core (as much as possible, >>>>>> i.e. >>>>>> for ioctl at least, standard sysfs), plus then enter/exit exported to >>>>>> drivers for their stuff. >>>>>> >>>>>> And it really needs to be srcu. For kms drivers the atomic_inc/dec >>>>>> wont >>>>>> matter, for render drivers it will show up (some of our ioctls are >>>>>> 100% >>>>>> lock less in the fastpath, not a single atomic op in there). >>>>> >>>>> Don't forget that we need to check the disconnect flag when entering >>>>> ioctls to >>>>> return -ENODEV. I'm open to clever solutions, but I'd first aim for >>>>> correctness with a lock and then replace the internal implementation. >>>> >>>> As Noralf pointed out, we already check for drm_dev_is_unplugged(). >>>> Maybe >>>> not in all places, but that can be fixed. >>>> >>>> And you can't first make all ioctl slower and then fix it up, at least >>>> not >>>> spread over multiple patch series. I guess for developing, doing the >>>> simpler atomic counter first is ok. But the same patch series needs to >>>> move over to srcu at the end. >>> >>> I've never used srcu/rcu before, care to explain a little bit more? >>> Is it drm_device we are protecting here? >>> >>> How can srcu be better for the fast path you're talking about if it's >>> half of srcu (assuming that makes srcu slower)? >> >> We're just protecting drm_device->unplugged with srcu, but the great thing >> is that srcu_read_lock/unlock are extremely cheap, and still guarantee >> that we can fully synchronize with writers. > > > Ok, so what makes this cheap is this_cpu_inc(), which on modern cpu's is > a single fast instruction? Yup. Atomics, locks and stuff like that usually require that the cpu flushes it entire pipeline, which is really expensive. > static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp) > { > int retval; > > retval = __srcu_read_lock(sp); > rcu_lock_acquire(&(sp)->dep_map); > return retval; > } > > int __srcu_read_lock(struct srcu_struct *sp) > { > int idx; > > idx = READ_ONCE(sp->srcu_idx) & 0x1; > this_cpu_inc(sp->sda->srcu_lock_count[idx]); > smp_mb(); /* B */ /* Avoid leaking the critical section. */ > return idx; > } > >> Here's the rough sketch: >> >> /* srcu is real cheap on the read side, we can have one for all of drm >> DEFINE_STATIC_SRCU(drm_unplug_srcu); >> >> drm_dev_enter() >> { >> srcu_read_lock(&drm_srcu); >> } > > > When entering I want to check at the same time. Is this OK? > > bool drm_dev_enter(struct drm_device *dev) > { > srcu_read_lock(&drm_srcu); > if (drm_dev_is_unplugged(dev)) { > srcu_read_unlock(&drm_srcu); > return false; > } > > return true; > } Yeah that makes sense. Still please keep the locking check in _is_unplugged() (but maybe we want it to be lockdep_assert_held, so it's compiled away to nothing for production builds). -Daniel > > Noralf. > > >> drm_dev_exit() >> { >> srcu_read_unlock(&drm_srcu); >> } >> >> drm_dev_is_unplugged() >> { >> /* checking unplugged state outside of holding the srcu read side >> * lock is racy, catch this. */ >> WARN_ON(!srcu_read_lock_head(&drm_unplug_srcu)); >> >> /* with rcu we can demote this to a simple read, no need for any >> * atomic or memory barries >> return dev->unplugged; >> } >> >> drm_dev_unplug() >> { >> dev->unplugged = true; >> >> /* This is were the real magic is. After it finished any critical >> * read section is guaranteed to see the new value of ->unplugged, >> * and any critical section which might still have seen the old >> * value of ->unplugged is guaranteed to have finished. >> * It also takes like forever to complete, but that's kinda the >> * point. */ >> synchronize_srcu(&drm_unplug_srcu); >> } >> >> Excercise for the reader is poking holes into the simpler atomic_t based >> approach and highlighting all the races. You probably need to reach the >> complexity of srcu until it's really race free (and fast). >> >> Oh, one more: please make sure you enable CONFIG_PROVE_LOCKING and >> especilly all the rcu options in there when working on this, to make sure >> we catch all the deadlocks and bugs. This is some really tricky locking >> stuff. >> >> Cheers, Daniel > > -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel