Andi Kleen wrote:
>> I appreciate this is sample code, but using __get_user() on
>> non-userspace pointers messes up architectures which have separate
>> user/kernel spaces (eg the old 4G/4G split for x86-32). Do we have an
>> appropriate function for kernel space pointers?
>
> probe_kernel_address().
>
> But it's slow.
>
> -Andi

Can I ask a simple question in light of all this discussion? "Is get_task_struct() really that bad?"

I have to admit you guys have somewhat lost me on some of the more recent discussion, so it's probably just a case of being naive on my part... but this whole thing seems to have become way more complex than it needs to be.

Let's boil this down to the core requirement: we need to know whether the owner task is still running somewhere in the system, as a predicate for whether we should sleep or spin, period. Now the question is how to do that.

The get/put task reference is the obvious answer to me (as an aside, we looked at task->oncpu rather than the rq->curr stuff, which I believe was better), and I am inclined to think it is a perfectly reasonable way to do this: after all, even if acquiring a reference is somewhat expensive (which I don't really think it is on a modern processor), we are already in the slowpath as it is and would sleep otherwise. (A rough sketch of what I mean is at the end of this mail.)

Steve proposed a really cool trick with RCU, since we know that the task cannot release while holding the lock, and the pointer cannot go away without waiting for a grace period. It turned out to introduce latency side effects, so it ultimately couldn't be used (and this was in no way a knock against RCU or you, Paul... it just wasn't the right tool for the job, it turned out).

Ok, so on to other ideas. What if we simply look at something like a scheduling sequence id? If we know (within the wait-lock) that task X is the owner and it's on CPU A, then we can simply monitor whether A context switches. Having something like rq[A]->seq++ every time we schedule() would suffice, and you wouldn't need to hold a task reference... just note A = X->cpu from inside the wait-lock. (Also sketched below.) I guess the downside there is putting that extra increment in the schedule() hotpath even if no one cares, but I would surmise it should be reasonably cheap when no one is pulling the cacheline around other than A (i.e. no observers).

But anyway, my impression from observing the direction this discussion has taken is that it is being way, way over-optimized before we even know if a) the adaptive stuff helps, and b) the get/put ref hurts.

Food for thought.

-Greg
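P.S. To make the get/put idea concrete, here is roughly what I have in mind. This is an untested sketch, not an actual patch; the lock->owner / lock->wait_lock / owner->oncpu names are only illustrative of whatever the lock type really provides:

	struct task_struct *owner;

	spin_lock(&lock->wait_lock);
	owner = lock->owner;			/* however the lock records its owner */
	if (owner)
		get_task_struct(owner);		/* pin the task so it cannot be freed under us */
	spin_unlock(&lock->wait_lock);

	if (owner) {
		/*
		 * Spin only while the owner is still running on a CPU; the
		 * moment it schedules out (or we need to), stop and go to
		 * sleep as usual.
		 */
		while (lock->owner == owner && owner->oncpu && !need_resched())
			cpu_relax();
		put_task_struct(owner);
	}
	/* ...fall through into the normal sleeping slowpath... */

The reference is taken and dropped exactly once per slowpath entry, which is why I don't think the atomic ops are a big deal compared to the sleep we would otherwise do.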
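P.P.S. And the sequence-id variant, equally untested. The sched_seq field and the bump in schedule() are hypothetical; cpu_rq()/task_cpu() are just how I'd spell "the runqueue of CPU A" here:

	/* in schedule(), on the local runqueue: */
	rq->sched_seq++;			/* hypothetical counter, bumped on every context switch */

	/* in the lock slowpath, sampled while we still hold the wait-lock: */
	int cpu = task_cpu(owner);		/* A = X->cpu from above */
	unsigned long seq = cpu_rq(cpu)->sched_seq;

	/*
	 * Spin until CPU A context switches.  We never dereference the
	 * owner after sampling, so no task reference is needed.  (A real
	 * version would also recheck the lock itself each iteration.)
	 */
	while (cpu_rq(cpu)->sched_seq == seq && !need_resched())
		cpu_relax();

Whether the extra increment in schedule() is acceptable is exactly the question I raised above.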