Re: [PATCH -rt] ipc/sem: Rework semaphore wakeups

On 09/14/2011 09:23 PM, Peter Zijlstra wrote:
On Wed, 2011-09-14 at 20:48 +0200, Manfred Spraul wrote:
The code does:

      spin_lock()
      preempt_disable();
      usually_very_simple_but_worstcase_O_2
      spin_unlock()
      usually_very_simple_but_worstcase_O_1
      preempt_enable();

with your change, it becomes:

      spin_lock()
      usually_very_simple_but_worstcase_O_2
      usually_very_simple_but_worstcase_O_1
      spin_unlock()

The complex ops remain unchanged, they are still under a lock.
preemptible lock (aka pi-mutex) on -rt, so no weird latencies.
But the change means that more operations are under spin_lock().
Actually, for a large SMP system with a simple semaphore operation, the wake_up_process() takes longer than the semaphore operation itself.
And for some databases, contention on the spin_lock() is an issue.


What about removing the preempt_disable?
It's only there to cover a rare race on uniprocessor preempt systems.
(a task is woken up simultaneously due to timeout of semtimedop() and a
true wakeup)

Then fix that race - something like the attached patch [obviously
buggy - see the fixme]
sched_yield() is always a bug, as it is here. It's a live-lock if the
woken task is of higher priority than the waking task. A higher prio
FIFO task calling sched_yield() in a loop is just that, a loop, starving
the lower prio waker.

If you've got enough medium prio tasks around to occupy all other cpus,
you've got indefinite priority inversion, so even on smp it's a problem.

But yeah, it's not the prettiest of solutions but it works.. see that
other patch with the wake-list stuff for something that ought to work
for both rt and mainline (except of course it doesn't actually work).
Wake lists are definitely the better approach.
[let's continue in that thread]
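
For readers who have not seen the other thread, here is a minimal user-space sketch of the general wake-list idea. It is not the patch under discussion and not kernel code. All names (toy_sem, toy_waiter, toy_sem_post) are hypothetical, the atomic_flag spinlock only stands in for the semaphore array lock, and setting the `woken` flag stands in for wake_up_process(). The point is that waiters are only unlinked and collected under the lock, and the possibly slow wakeups run after the lock is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical waiter record; 'woken' stands in for wake_up_process(). */
struct toy_waiter {
    int need;                    /* units this waiter wants to take */
    atomic_bool woken;           /* set by the waker once the request is granted */
    struct toy_waiter *next;     /* link in the pending queue, later in the wake list */
};

/* Hypothetical semaphore; 'lock' stands in for the real spinlock. */
struct toy_sem {
    atomic_flag lock;            /* protects value and pending */
    int value;
    struct toy_waiter *pending;  /* singly linked queue of blocked waiters */
};

static void toy_lock(struct toy_sem *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void toy_unlock(struct toy_sem *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/*
 * Add 'n' units, grant every pending request that now fits, and wake
 * the granted waiters. Returns the number of waiters woken.
 */
int toy_sem_post(struct toy_sem *s, int n)
{
    struct toy_waiter *wake_list = NULL;   /* private to this call */
    struct toy_waiter **link;
    int woken = 0;

    toy_lock(s);
    s->value += n;
    link = &s->pending;
    while (*link) {
        struct toy_waiter *w = *link;

        if (w->need <= s->value) {
            s->value -= w->need;
            *link = w->next;       /* unlink from the pending queue */
            w->next = wake_list;   /* push onto the local wake list */
            wake_list = w;
        } else {
            link = &w->next;       /* cannot be satisfied yet, keep it queued */
        }
    }
    toy_unlock(s);

    /* The wakeups themselves run without the lock held. */
    while (wake_list) {
        struct toy_waiter *w = wake_list;

        wake_list = w->next;       /* read the link before waking: the waiter */
        w->next = NULL;            /* may reuse its record once it sees 'woken' */
        atomic_store(&w->woken, true);
        woken++;
    }
    return woken;
}
```

With two queued waiters needing 1 and 3 units, a post of 2 grants and wakes only the first and leaves one unit behind; a second post of 2 wakes the other. Note that this sketch wakes in reverse order of collection, which a real implementation would probably not want.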

--
    Manfred