Hi Thomas, On Sun, Nov 28, 2010 at 03:19:50PM +0100, Thomas Hellstrom wrote: > I'm not saying that the current gem code is incorrect. In > particular, the gem name system allows upping the handle_count even > if it's zero and the name is just about to be destroyed. I see that > more as a curiosity, and gem would work just as well if it used > kref_get_unless_zero on the handle_count. Yeah, noticed that, too. But that's just userspace racing against itself, so no problem. > However this is a generic problem. > It appears in the general user-visible object case. For example, > currently gem needs to take the struct_mutex *before* unreferencing > to avoid hitting this problem with a racing mmap lookup (At least I > think that's why the struct_mutex is taken). If mmap were using > kref_get_unless_zero, and returning error on failure, the > struct_mutex would be needed only inside the kref release function, > and that is what Dave was referring to in his mail. Indeed, I haven't noticed that. drm/i915 doesn't hold a ref to the underlying bo for the offset_hash, which requires the dev->struct_mutex around unref. An actual mmap gets itself a ref (protected by dev->struct_mutex) so from there on its all fine. I think the offset_hash should hold a reference too, which gets destroyed when handle_count reaches 0 (like the flink name). Haven't checked what ttm does for this case. I think having a pointer without an accompanying reference is a bug, i.e. I've jotted this down on my list of things to fix. Then we could move the mutex taking inside the actual free function. > In particular this also applies to the rcu case, were we cannot > block readers. If we were to move over all object lookup to use RCU > locks, we'd want to be completely 100% sure that once a refcount > reaches zero, it won't be upped again, and we can safely go ahead > and use call_rcu for destruction. Let's say this wasn't true, > call_rcu could in principle be called multiple times for the same > object, and that isn't allowed until the callback has been run... Well, if the lookup structure has a reference to the underlying object we only need to defer the unref till all readers are gone with call_rcu. > And the simple solution to this problem is kref_get_unless_zero(). IMHO kref is for kernel references. And shouldn't be abused for userspace lookup with strange semantics. So kref_get_unless_zero sounds like a very bad idea hiding brittle locking semantics at best, bugs at worst. Cheers, Daniel -- Daniel Vetter Mail: daniel@xxxxxxxx Mobile: +41 (0)79 365 57 48 _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel