On Thu, Feb 28, 2013 at 1:14 PM, Rik van Riel <riel@xxxxxxxxxx> wrote:
>
> I have modified one of the semop tests to use multiple semaphores.

Ooh yeah. This shows contention quite nicely. And it's all from
ipc_lock, and looking at the top-10 offenders of the profile:

    43.01%  semop-multi  [kernel.kallsyms]  [k] _raw_spin_lock
    ...
     4.73%  semop-multi  [kernel.kallsyms]  [k] avc_has_perm_flags
     4.52%  semop-multi  [kernel.kallsyms]  [k] ipc_has_perm.isra.21
    ...
     2.43%  semop-multi  [kernel.kallsyms]  [k] ipcperms

The 43% isn't actually all that interesting, it just shows that there
is contention and we're waiting for other users. Yes, we waste almost
half the CPU time on locking, but ignore that for a moment.

The "more than 10% of the total time is spent in ipc permission code"
*is* the interesting part. Because that 10%+ is actually more like 20%
if you ignore the "wait for lock" part (the three entries above add up
to about 11.7% of the total, out of the roughly 57% that is not spent
spinning). And it's all done *inside* the lock.

In other words, I can pretty much guarantee that the contention will
go down a lot if we just move the security check outside the spinlock.
According to the above numbers, we're currently spending basically
1/5th of our remaining CPU resources serialized for absolutely no good
reason. THAT is the kind of thing we shouldn't do.

The rest of the big offenders seem to be mostly done outside the
spinlock, although it's hard to tell how much of the 10% of
sys_semtimedop() is also under the lock. There are probably other
things there than just the permission checking.

I'm not seeing any real reason the permission checking couldn't be
done just under the RCU lock, before we get the spinlock. Except for
the fact that the "helper" routines in ipc/util.c are written the way
they are, so it's a layering violation. But I really think that would
be a *reasonably* low-hanging fruit thing to do.

Changing the locking itself to be more fine-grained, and doing it
across many different ipc semaphores would be a major pain.
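
To make the "move the check outside the lock" idea concrete, here is a
rough user-space sketch. It is not the ipc/util.c code: every demo_*
name is hypothetical, a pthread mutex stands in for the per-object
spinlock, and the RCU side is left out entirely. It also assumes the
permission data never changes after init, so reading it without the
lock is safe here. A real kernel change would in addition have to deal
with object lifetime and with the object being removed between the
check and the lock, which this sketch does not model.

```c
#include <errno.h>
#include <pthread.h>

#define DEMO_MAY_WRITE 0200u    /* owner write bit, as in a 0600-style mode */

/* Hypothetical stand-in for the permission data of an IPC object. */
struct demo_perm {
    unsigned int owner_uid;
    unsigned int mode;
};

/* Hypothetical stand-in for a semaphore array guarded by one lock. */
struct demo_sem_array {
    pthread_mutex_t lock;
    struct demo_perm perm;      /* assumed immutable after init */
    long checks_under_lock;     /* accounting: checks done while locked */
    long value;
};

int demo_sem_array_init(struct demo_sem_array *sma,
                        unsigned int uid, unsigned int mode)
{
    sma->perm.owner_uid = uid;
    sma->perm.mode = mode;
    sma->checks_under_lock = 0;
    sma->value = 0;
    return pthread_mutex_init(&sma->lock, NULL);
}

/* Pure check on read-only data; plays the role of the permission code. */
int demo_perm_check(const struct demo_perm *p,
                    unsigned int uid, unsigned int want)
{
    if (uid == p->owner_uid && (p->mode & want) == want)
        return 0;
    return -EACCES;
}

/* Pattern being criticized: the check is serialized with the update. */
int demo_semop_check_inside(struct demo_sem_array *sma,
                            unsigned int uid, long delta)
{
    int err;

    pthread_mutex_lock(&sma->lock);
    sma->checks_under_lock++;
    err = demo_perm_check(&sma->perm, uid, DEMO_MAY_WRITE);
    if (!err)
        sma->value += delta;
    pthread_mutex_unlock(&sma->lock);
    return err;
}

/* Suggested pattern: check first, take the lock only for the update. */
int demo_semop_check_outside(struct demo_sem_array *sma,
                             unsigned int uid, long delta)
{
    int err = demo_perm_check(&sma->perm, uid, DEMO_MAY_WRITE);

    if (err)
        return err;             /* denied callers never touch the lock */

    pthread_mutex_lock(&sma->lock);
    sma->value += delta;
    pthread_mutex_unlock(&sma->lock);
    return 0;
}
```

Both variants return the same results to the caller. The difference is
only in how much work sits inside the critical section, and that
callers who fail the check no longer contend on the lock at all.
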
So I do suspect that the work Michel Lespinasse did is probably worth
doing anyway, in addition to at least trying to fix the horrible lack
of scalability of the code a bit.

               Linus