On 11/28/2010 02:35 PM, Daniel Vetter wrote:
On Sun, Nov 28, 2010 at 01:32:27PM +0100, Thomas Hellstrom wrote:
This is racy, in that the kref_get() can hit a zero refcount. I
think an ideal thing here would be to add a kref_get_unless_zero()
for this situation, that returns an error if the refcount was indeed
zero. I don't think that would violate the kref idea, since we still
never increase the refcount of a zeroed object, and the user needs
to provide the necessary synchronization to make sure the object
isn't gone while still the refcount is zero.
I've got curious rechecked the code and it all looks sane. The trick is
that all the userspace-visible names hold a ref to the underlying gem
object (including the flink name). If all userspace handles are destroyed
(save the flink name) the flink name gets destroyed, too (see
obj->handle_count). All this is independently protected with spinlocks
and it happens _before_ dropping the ref to the underlying bo (and taking
any necessary locks to do so). I don't see any races nor locking problems,
there.
-Daniel
Hi, Daniel!
I'm not saying that the current gem code is incorrect. In particular,
the gem name system allows upping the handle_count even if it's zero and
the name is just about to be destroyed. I see that more as a curiosity,
and gem would work just as well if it used kref_get_unless_zero on the
handle_count.
However this is a generic problem.
It appears in the general user-visible object case. For example,
currently gem needs to take the struct_mutex *before* unreferencing to
avoid hitting this problem with a racing mmap lookup (At least I think
that's why the struct_mutex is taken). If mmap were using
kref_get_unless_zero, and returning error on failure, the struct_mutex
would be needed only inside the kref release function, and that is what
Dave was referring to in his mail.
An identical problem appears in TTM where we need to take the
bdev::vm_lock around unreffing a ttm bo, and in the ttm_object
functionality were we also have a general lookup mechanism for stuff
like contexts, surfaces, user-visible fences or whatever.
In particular this also applies to the rcu case, were we cannot block
readers. If we were to move over all object lookup to use RCU locks,
we'd want to be completely 100% sure that once a refcount reaches zero,
it won't be upped again, and we can safely go ahead and use call_rcu for
destruction. Let's say this wasn't true, call_rcu could in principle be
called multiple times for the same object, and that isn't allowed until
the callback has been run...
And the simple solution to this problem is kref_get_unless_zero().
/Thomas
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel