Den 04.09.2017 11.04, skrev Daniel Vetter:
On Mon, Sep 04, 2017 at 11:41:05AM +0300, Laurent Pinchart wrote:
Hi Daniel,
On Monday, 4 September 2017 10:26:15 EEST Daniel Vetter wrote:
On Thu, Aug 31, 2017 at 09:22:03PM +0200, Noralf Trønnes wrote:
Den 31.08.2017 14.59, skrev Laurent Pinchart:
On Wednesday, 30 August 2017 20:18:49 EEST Daniel Vetter wrote:
On Wed, Aug 30, 2017 at 6:31 PM, Noralf Trønnes wrote:
Den 28.08.2017 23.56, skrev Daniel Vetter:
On Mon, Aug 28, 2017 at 07:17:48PM +0200, Noralf Trønnes wrote:
+
+ drm_fbdev_cma_dev_unplug(tdev->fbdev_cma);
+ drm_dev_unplug(&tdev->drm);
+
+ /* Make sure framebuffer flushing is done */
+ mutex_lock(&tdev->dirty_lock);
+ mutex_unlock(&tdev->dirty_lock);
Is this really needed? Or, doesn't it just paper over a driver bug
you have already anyway, since native kms userspace can directly
call fb->funcs->dirty too, and you already protect against that.
This definitely looks like the fbdev helper is leaking
implementation details to callers where it shouldn't do that.
Flushing can happen while drm_dev_unplug() is called, and when we
leave this function the device facing resources controlled by devres
will be removed. Thus I have to make sure any such flushing is done
before leaving so the next flush is stopped by the
drm_dev_is_unplugged() check. I don't see any other way of ensuring
that.
I see now that I should move the call to
drm_atomic_helper_shutdown() after drm_dev_unplug() to properly
protect the pipe .enable/.disable callbacks.
Hm, calling _shutdown when the hw is gone already won't end well.
Fundamentally this race exists for all use-cases, and I'm somewhat
leaning towards plugging it in the core.
The general solution probably involves something that smells a lot
like srcu, i.e. at every possible entry point into a drm driver
(ioctl, fbdev, dma-buf sharing, everything really) we take that
super-cheap read-side look, and drop it when we leave.
That's similar to what we plan to do in V4L2. The idea is to set a
device removed flag at the beginning of the .remove() handler and wait
for all pending operations to complete. The core will reject any new
operation when the flag is set. To wait for completion, every entry
point would increase a use count, and decrease it on exit. When the use
count is decreased to 0 waiters will be woken up. This should solve the
unplug/user race.
Ah, such a simple solution, easy to understand and difficult to get wrong!
And it's even nestable, no danger of deadlocking.
Maybe I can use it with tinydrm:
* @dev_use: Tracks use of functions acessing the parent device.
* If it is zero, the device is gone. See ...
struct tinydrm_device {
atomic_t dev_use;
};
/**
* tinydrm_dev_enter - Enter device accessing function
* @tdev: tinydrm device
*
* This function protects against using a device and it's resources after
* it's removed. Should be called at the beginning of the function.
*
* Returns:
* False if the device is still present, true if it is gone.
*/
static inline bool tinydrm_dev_enter(struct tinydrm_device *tdev)
{
return !atomic_inc_not_zero(&tdev->dev_use);
}
static inline void tinydrm_dev_exit(struct tinydrm_device *tdev)
{
atomic_dec(&tdev->dev_use);
}
static inline bool tinydrm_is_unplugged(struct tinydrm_device *tdev)
{
bool ret = !atomic_read(&tdev->dev_use);
smp_rmb();
return ret;
}
static int tinydrm_init(...)
{
/* initialize */
/* Set device is present */
atomic_set(&tdev->dev_use, 1);
}
static void tinydrm_unregister(...)
{
/* Set device gone */
atomic_dec(&tdev->dev_use);
/* Wait for all device facing functions to finish */
while (!tinydrm_is_unplugged(tdev)) {
cond_resched();
}
/* proceed with unregistering */
}
static int mipi_dbi_fb_dirty(...)
{
if (tinydrm_dev_enter(tdev))
return -ENODEV;
/* flush framebuffer */
tinydrm_dev_exit(tdev);
}
Yup, expect imo this should be done in drm core (as much as possible, i.e.
for ioctl at least, standard sysfs), plus then enter/exit exported to
drivers for their stuff.
And it really needs to be srcu. For kms drivers the atomic_inc/dec wont
matter, for render drivers it will show up (some of our ioctls are 100%
lock less in the fastpath, not a single atomic op in there).
Don't forget that we need to check the disconnect flag when entering ioctls to
return -ENODEV. I'm open to clever solutions, but I'd first aim for
correctness with a lock and then replace the internal implementation.
As Noralf pointed out, we already check for drm_dev_is_unplugged(). Maybe
not in all places, but that can be fixed.
And you can't first make all ioctl slower and then fix it up, at least not
spread over multiple patch series. I guess for developing, doing the
simpler atomic counter first is ok. But the same patch series needs to
move over to srcu at the end.
I've never used srcu/rcu before, care to explain a little bit more?
Is it drm_device we are protecting here?
How can srcu be better for the fast path you're talking about if it's
half of srcu (assuming that makes srcu slower)?
Noralf.
But I'm not sure that's a good idea, since implementing this 100%
correctly using your atomic_t idea means you implement half of srcu
anyway. Otoh discovering all those races should be an interesting journey
:-)
-Daniel
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel