+ ipc-sem-sem_lock-with-hysteresis.patch added to -mm tree

The patch titled
     Subject: ipc/sem: sem_lock with hysteresis
has been added to the -mm tree.  Its filename is
     ipc-sem-sem_lock-with-hysteresis.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/ipc-sem-sem_lock-with-hysteresis.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/ipc-sem-sem_lock-with-hysteresis.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included in linux-next and is updated
there every 3-4 working days.

------------------------------------------------------
From: Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
To: "H. Peter Anvin" <hpa@xxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>,
        Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>,
        Davidlohr Bueso <dave@xxxxxxxxxxxx>
Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>,
        Ingo Molnar <mingo@xxxxxxx>, 1vier1@xxxxxx,
        felixh@xxxxxxxxxxxxxxxxxxxxxxxx,
        Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
Subject: ipc/sem: sem_lock with hysteresis
Date: Sat, 25 Jun 2016 19:37:52 +0200
Message-Id: <1466876272-3824-3-git-send-email-manfred@xxxxxxxxxxxxxxxx>
X-Mailer: git-send-email 2.5.5
In-Reply-To: <1466876272-3824-2-git-send-email-manfred@xxxxxxxxxxxxxxxx>
References: <1466876272-3824-1-git-send-email-manfred@xxxxxxxxxxxxxxxx>
 <1466876272-3824-2-git-send-email-manfred@xxxxxxxxxxxxxxxx>

SysV semaphores have two lock modes: one with per-semaphore locks and
one with a single big lock for the whole array.
When switching from the per-semaphore locks to the big lock, all
per-semaphore locks must be scanned for ongoing operations.
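
As a rough illustration of why that switch is costly, the following
userspace sketch models the scan with C11 atomics. It is not the kernel
code: struct toy_sem, struct toy_array and toy_complexmode_enter() are
hypothetical names, and a busy-wait on an atomic flag stands in for
waiting on the real per-semaphore spinlocks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for one semaphore and its private lock. */
struct toy_sem {
	atomic_int locked;		/* non-zero while a simple op runs */
};

/* Hypothetical stand-in for the semaphore array. */
struct toy_array {
	atomic_int complex_mode;	/* >0: simple ops must use the big lock */
	size_t nsems;
	struct toy_sem *sems;
};

/*
 * Switch to big-lock mode. Returns how many per-semaphore locks had
 * to be visited, which is the cost the patch tries to avoid paying
 * too often.
 */
static size_t toy_complexmode_enter(struct toy_array *a)
{
	size_t scanned = 0;

	assert(a->nsems == 0 || a->sems != NULL);

	if (atomic_load(&a->complex_mode) > 0)
		return 0;		/* already in big-lock mode, no scan */

	/* Sequentially consistent store: ordered before the loads below. */
	atomic_store(&a->complex_mode, 1);

	for (size_t i = 0; i < a->nsems; i++) {
		/* Wait until any simple op still holding this lock is done. */
		while (atomic_load(&a->sems[i].locked))
			;
		scanned++;
	}
	return scanned;
}
```

The scan is linear in the number of semaphores, so an array that keeps
flipping between the two modes pays that cost on every flip.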

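This patch adds hysteresis for switching back from the big lock to the
per-semaphore locks: entering complex mode sets a counter to
COMPLEX_MODE_ENTER (10), each simple operation that takes the big lock
while no complex operation is pending decrements it, and the array
returns to per-semaphore locking only when the counter reaches 0.
This reduces how often the per-semaphore locks must be scanned.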
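
The counter logic can be modelled in a few lines of plain C. This is a
single-threaded sketch under simplifying assumptions: struct sem_model
and the model_*() helpers are hypothetical, locking and memory barriers
are left out, and only COMPLEX_MODE_ENTER and the two field names are
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define COMPLEX_MODE_ENTER 10	/* same value as in the patch */

/* Hypothetical stand-in for the two struct sem_array fields used here. */
struct sem_model {
	int complex_count;	/* pending complex operations */
	int complex_mode;	/* >0: simple ops must take the big lock */
};

/*
 * Models complexmode_enter(): (re)arm the hysteresis counter.
 * Returns true if the per-semaphore locks would have to be scanned,
 * i.e. if the array was in simple mode before the call.
 */
static bool model_complex_enter(struct sem_model *m)
{
	bool need_scan = (m->complex_mode == 0);

	m->complex_mode = COMPLEX_MODE_ENTER;
	return need_scan;
}

/*
 * Models the slow path of sem_lock() for a simple operation, entered
 * with the big lock held. Returns true if the operation may drop back
 * to its per-semaphore lock.
 */
static bool model_simple_slowpath(struct sem_model *m)
{
	if (m->complex_count == 0) {
		if (m->complex_mode > 0)
			m->complex_mode--;
		if (m->complex_mode == 0)
			return true;
	}
	assert(m->complex_mode > 0);	/* mirrors the WARN_ON() in the patch */
	return false;
}
```

In this model, a complex operation that arrives while the counter is
still above 0 re-arms it without a new scan, which is where the saving
comes from.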

Passed stress testing with sem-scalebench.

Signed-off-by: Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
---
 include/linux/sem.h |  2 +-
 ipc/sem.c           | 89 +++++++++++++++++++++++++++++------------------------
 2 files changed, 49 insertions(+), 42 deletions(-)

diff --git a/include/linux/sem.h b/include/linux/sem.h
index d0efd6e..6fb3227 100644
--- a/include/linux/sem.h
+++ b/include/linux/sem.h
@@ -21,7 +21,7 @@ struct sem_array {
 	struct list_head	list_id;	/* undo requests on this array */
 	int			sem_nsems;	/* no. of semaphores in array */
 	int			complex_count;	/* pending complex operations */
-	bool			complex_mode;	/* no parallel simple ops */
+	int			complex_mode;	/* >0: no parallel simple ops */
 };
 
 #ifdef CONFIG_SYSVIPC
diff --git a/ipc/sem.c b/ipc/sem.c
index 538f43a..076b7c9 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -161,6 +161,13 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it);
 #define SEMOPM_FAST	64  /* ~ 372 bytes on stack */
 
 /*
+ * Switching from the mode suitable for simple ops
+ * to the mode for complex ops is costly. Therefore:
+ * use some hysteresis
+ */
+#define COMPLEX_MODE_ENTER 10
+
+/*
  * Locking:
  * a) global sem_lock() for read/write
  *	sem_undo.id_next,
@@ -269,17 +276,25 @@ static void sem_rcu_free(struct rcu_head *head)
 /*
  * Enter the mode suitable for non-simple operations:
  * Caller must own sem_perm.lock.
+ * Note:
+ * There is no leave complex mode function. Leaving
+ * happens in sem_lock, with some hysteresis.
  */
 static void complexmode_enter(struct sem_array *sma)
 {
 	int i;
 	struct sem *sem;
 
-	if (sma->complex_mode)  {
-		/* We are already in complex_mode. Nothing to do */
+	if (sma->complex_mode > 0)  {
+		/*
+		 * We are already in complex_mode.
+		 * Nothing to scan, just reset the
+		 * hysteresis counter to its start value.
+		 */
+		WRITE_ONCE(sma->complex_mode, COMPLEX_MODE_ENTER);
 		return;
 	}
-	WRITE_ONCE(sma->complex_mode, true);
+	WRITE_ONCE(sma->complex_mode, COMPLEX_MODE_ENTER);
 
 	/* We need a full barrier:
 	 * The write to complex_mode must be visible
@@ -294,27 +309,6 @@ static void complexmode_enter(struct sem_array *sma)
 }
 
 /*
- * Try to leave the mode that disallows simple operations:
- * Caller must own sem_perm.lock.
- */
-static void complexmode_tryleave(struct sem_array *sma)
-{
-	if (sma->complex_count)  {
-		/* Complex ops are sleeping.
-		 * We must stay in complex mode
-		 */
-		return;
-	}
-	/*
-	 * Immediately after setting complex_mode to false,
-	 * a simple op can start. Thus: all memory writes
-	 * performed by the current operation must be visible
-	 * before we set complex_mode to false.
-	 */
-	smp_store_release(&sma->complex_mode, false);
-}
-
-/*
  * If the request contains only one semaphore operation, and there are
  * no complex transactions pending, lock only the semaphore involved.
  * Otherwise, lock the entire semaphore array, since we either have
@@ -372,27 +366,42 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
 	ipc_lock_object(&sma->sem_perm);
 
 	if (sma->complex_count == 0) {
-		/* False alarm:
-		 * There is no complex operation, thus we can switch
-		 * back to the fast path.
-		 */
-		spin_lock(&sem->lock);
-		ipc_unlock_object(&sma->sem_perm);
-		return sops->sem_num;
-	} else {
-		/* Not a false alarm, thus complete the sequence for a
-		 * full lock.
+		/*
+		 * Check if fast path is possible:
+		 * no complex operation is pending, so check the hysteresis
+		 * counter. If it reaches 0, switch back to the fast path.
 		 */
-		complexmode_enter(sma);
-		return -1;
+		if (sma->complex_mode > 0) {
+			/* Note:
+			 * Immediately after setting complex_mode to 0,
+			 * a simple op could start.
+			 * The data it would access was written by the
+			 * previous owner of sma->sem_perm.lock, i.e.
+			 * a release and an acquire memory barrier ago.
+			 * No need for another barrier.
+			 */
+			WRITE_ONCE(sma->complex_mode, sma->complex_mode-1);
+		}
+		if (sma->complex_mode == 0) {
+			spin_lock(&sem->lock);
+			ipc_unlock_object(&sma->sem_perm);
+			return sops->sem_num;
+		}
 	}
+	/*
+	 * Not a false alarm, full lock is required.
+	 * Since we are already in complex_mode (either because of waiting
+	 * complex ops or due to hysteresis), there is no need for a
+	 * complexmode_enter().
+	 */
+	WARN_ON(sma->complex_mode == 0);
+	return -1;
 }
 
 static inline void sem_unlock(struct sem_array *sma, int locknum)
 {
 	if (locknum == -1) {
 		unmerge_queues(sma);
-		complexmode_tryleave(sma);
 		ipc_unlock_object(&sma->sem_perm);
 	} else {
 		struct sem *sem = sma->sem_base + locknum;
@@ -544,7 +553,7 @@ static int newary(struct ipc_namespace *ns, struct ipc_params *params)
 	}
 
 	sma->complex_count = 0;
-	sma->complex_mode = true; /* dropped by sem_unlock below */
+	WRITE_ONCE(sma->complex_mode, COMPLEX_MODE_ENTER);
 	INIT_LIST_HEAD(&sma->pending_alter);
 	INIT_LIST_HEAD(&sma->pending_const);
 	INIT_LIST_HEAD(&sma->list_id);
@@ -2201,7 +2210,7 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it)
 	 * The proc interface isn't aware of sem_lock(), it calls
 	 * ipc_lock_object() directly (in sysvipc_find_ipc).
 	 * In order to stay compatible with sem_lock(), we must
-	 * enter / leave complex_mode.
+	 * enter complex_mode.
 	 */
 	complexmode_enter(sma);
 
@@ -2220,8 +2229,6 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it)
 		   sem_otime,
 		   sma->sem_ctime);
 
-	complexmode_tryleave(sma);
-
 	return 0;
 }
 #endif
-- 
2.5.5

Patches currently in -mm which might be from manfred@xxxxxxxxxxxxxxxx are

ipc-semc-fix-complex_count-vs-simple-op-race.patch
ipc-sem-sem_lock-with-hysteresis.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


