On 03/11/2014 06:45 AM, Ingo Molnar wrote:
* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
Hi Waiman,
I promised you this series a number of days ago; sorry for the delay,
I've been somewhat unwell :/
That said, these few patches start with a (hopefully) simple and
correct form of the queue spinlock, and then gradually build upon
it, explaining each optimization as we go.
Having these optimizations as separate patches helps twofold;
firstly it makes one aware of which exact optimizations were done,
and secondly it allows one to prove or disprove any one step;
seeing how they should be mostly identity transforms.
The resulting code is near to what you posted I think; however it
has one atomic op less in the pending wait-acquire case for NR_CPUS
!= huge. It also doesn't do lock stealing; it's still perfectly fair
afaict.
Have I missed any tricks from your code?
Waiman, you indicated in the other thread that these look good to you,
right? If so then I can queue them up so that they form a base for
further work.
It would be nice to have per patch performance measurements though ...
this split-up structure really enables that rather nicely.
Thanks,
Ingo
As Peter said, I haven't reviewed his changes yet. The patch I am
working on has an optimization similar to PeterZ's small-NR_CPUS
change, except that I do a single atomic short integer write to switch
the bits instead of two byte writes. However, this code seems to have a
problem working with the lockref code, and I hit a panic in
fs/dcache.c, so I am investigating that issue.
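
To make the difference between the two variants concrete, here is a
minimal userspace sketch. It is not taken from either patch series: the
union layout (locked byte, pending byte, 16-bit tail, little-endian),
the helper names, and the use of the GCC/Clang __atomic builtins are
all assumptions for illustration, and memory ordering is deliberately
left relaxed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical lock word layout (assumes a little-endian machine):
 *   byte 0     : locked
 *   byte 1     : pending
 *   bytes 2..3 : queue tail
 */
union qlock {
	uint32_t val;
	struct {
		uint8_t  locked;
		uint8_t  pending;
		uint16_t tail;
	};
	struct {
		uint16_t locked_pending;	/* locked + pending as one halfword */
		uint16_t tail_hw;
	};
};

#define Q_LOCKED_ONLY	0x0001u		/* locked = 1, pending = 0 */

/* Variant A: the pending waiter takes the lock with two byte stores. */
static void handoff_two_bytes(union qlock *l)
{
	__atomic_store_n(&l->locked, 1, __ATOMIC_RELAXED);
	__atomic_store_n(&l->pending, 0, __ATOMIC_RELAXED);
}

/* Variant B: one 16-bit store flips both bytes at once. */
static void handoff_one_halfword(union qlock *l)
{
	__atomic_store_n(&l->locked_pending, Q_LOCKED_ONLY, __ATOMIC_RELAXED);
}

/*
 * Start from "pending set, lock free, tail = 5", perform the handoff,
 * and return the resulting 32-bit lock word.
 */
static uint32_t demo_handoff(int use_halfword)
{
	union qlock l;

	l.val = (5u << 16) | (1u << 8);
	assert(l.pending == 1);		/* fails on a big-endian host */

	if (use_halfword)
		handoff_one_halfword(&l);
	else
		handoff_two_bytes(&l);

	return l.val;
}
```

Both variants leave the tail untouched and end with locked = 1 and
pending = 0. With the two byte stores, other CPUs can briefly see both
bits set; the halfword store has no such intermediate state.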
I am also trying to revise the PV support to be similar to what is
currently done in the PV ticketlock code. That is why I have been
rather quiet this past week.
-Longman