On Wed, Jul 24, 2013 at 11:02:02AM +0200, Daniel Vetter wrote: > On Wed, Jul 24, 2013 at 8:04 AM, Daniel Vetter <daniel.vetter@xxxxxxxx> wrote: > > This is the 2nd attempt, I've always been a bit dissatisified with the > > tricky nature of the first one: > > > > http://lists.freedesktop.org/archives/dri-devel/2012-July/025451.html > > > > The issue is that the flink ioctl can race with calling gem_close on > > the last gem handle. In that case we'll end up with a zero handle > > count, but an flink name (and it's corresponding reference). Which > > results in a neat space leak. > > > > In my first attempt I've solved this by rechecking the handle count. > > But fundamentally the issue is that ->handle_count isn't your usual > > refcount - it can be resurrected from 0 among other things. > > > > For those special beasts atomic_t often suggest way more ordering that > > it actually guarantees. To prevent being tricked by those hairy > > semantics take the easy way out and simply protect the handle with the > > existing dev->object_name_lock. > > > > With that change implemented it's dead easy to fix the flink vs. gem > > close reace: When we try to create the name we simply have to check > > whether there's still officially a gem handle around and if not refuse > > to create the flink name. Since the handle count decrement and flink > > name destruction is now also protected by that lock the reace is gone > > and we can't ever leak the flink reference again. > > > > Outside of the drm core only the exynos driver looks at the handle > > count, and tbh I have no idea why (it's just for debug dmesg output > > luckily). > > > > I've considered inlining the drm_gem_object_handle_free, but I plan to > > add more name-like things (like the exported dma_buf) to this scheme, > > so it's clearer to leave the handle freeing in its own function. > > > > This is exercised by the new gem_flink_race i-g-t testcase, which on > > my snb leaks gem objects at a rate of roughly 1k objects/s. > > That's actually incorrect since the leak I've found is just a race in > the drm/i915 object tracking. So I need to go back to the drawing > board and figure out which are the ghosts and which the dragons here. > > I've turned that testcase into an exercise for "drm/gem: completely > close gem_open vs. gem_close races", but that race only results in > userspace seeing different flink names for the same object. And that > only happens if userspace is racy already. > > For this patch here I still think there's an issue, but I seriously > need to restart my brain first and flush out the bogons with some > coffee before I try again ;-) Ok, I've written a second subtest now, and the race is indeed there and I've managed to leak a few objects. It's rather hard to hit though, I get a leak for roughly ever 1M attempts to provoke the race. But I'm rather convinced now that the leak is indeed there ;-) So I think the patch as-is stands as correct and required to block off evil usespace. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel