On Wed, 2013-04-17 at 15:23 -0400, Waiman Long wrote:
> The current mutex spinning code (with the MUTEX_SPIN_ON_OWNER option
> turned on) allows multiple tasks to spin on a single mutex
> concurrently. A potential problem with the current approach is that
> when the mutex becomes available, all the spinning tasks will try to
> acquire the mutex more or less simultaneously. As a result, there
> will be a lot of cacheline bouncing, especially on systems with a
> large number of CPUs.
>
> This patch tries to reduce this kind of contention by putting the
> mutex spinners into a queue so that only the first one in the queue
> will try to acquire the mutex. This will reduce contention and allow
> all the tasks to move forward faster.
>
> The queuing of mutex spinners is done with an MCS-lock-based
> implementation, which reduces contention on the mutex cacheline
> further than a similar ticket-spinlock-based implementation would.
> This patch adds a new field to the mutex data structure for holding
> the MCS lock. This expands the mutex size by 8 bytes on 64-bit
> systems and 4 bytes on 32-bit systems. This overhead is avoided if
> the MUTEX_SPIN_ON_OWNER option is turned off.
>
> The following table shows the jobs-per-minute (JPM) scalability data
> on an 8-node, 80-core Westmere box with a 3.7.10 kernel. The numactl
> command is used to restrict the running of the fserver workloads to
> 1/2/4/8 nodes with hyperthreading off.
[...]
> The short workload is the only one that shows a decline in
> performance, probably due to the spinner locking and queuing
> overhead.
>
> Signed-off-by: Waiman Long <Waiman.Long@xxxxxx>
> Acked-by: Rik van Riel <riel@xxxxxxxxxx>

Reviewed-by: Davidlohr Bueso <davidlohr.bueso@xxxxxx>
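
For readers unfamiliar with MCS locks, a minimal sketch of the
spinner-queue idea follows. This is illustrative only, not the patch
itself: it assumes the kernel's xchg()/cmpxchg(), ACCESS_ONCE(),
smp_wmb() and arch_mutex_cpu_relax() primitives, and the mspin_node /
mspin_lock / mspin_unlock names are placeholders rather than the
patch's actual identifiers.

struct mspin_node {
	struct mspin_node *next;
	int locked;	/* 1 once the lock is handed to this node */
};

static void mspin_lock(struct mspin_node **lock, struct mspin_node *node)
{
	struct mspin_node *prev;

	node->locked = 0;
	node->next   = NULL;

	/* Atomically enqueue ourselves at the tail of the queue. */
	prev = xchg(lock, node);
	if (prev == NULL) {
		/* Queue was empty: we own the lock immediately. */
		node->locked = 1;
		return;
	}
	ACCESS_ONCE(prev->next) = node;
	smp_wmb();

	/* Spin on our own node until our predecessor hands off. */
	while (!ACCESS_ONCE(node->locked))
		arch_mutex_cpu_relax();
}

static void mspin_unlock(struct mspin_node **lock, struct mspin_node *node)
{
	struct mspin_node *next = ACCESS_ONCE(node->next);

	if (!next) {
		/* No known successor: try to release the lock outright. */
		if (cmpxchg(lock, node, NULL) == node)
			return;
		/* A new spinner is mid-enqueue; wait for its next pointer. */
		while (!(next = ACCESS_ONCE(node->next)))
			arch_mutex_cpu_relax();
	}
	/* Hand the lock to our successor. */
	ACCESS_ONCE(next->locked) = 1;
	smp_wmb();
}

The point of the structure is that each waiter spins only on its own
node's locked flag, so the handoff dirties a single remote cacheline
instead of having every spinner hammer the shared mutex word, which is
where the cacheline-bouncing reduction described above comes from.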