Re: [PATCH RFC 01/26] netfilter: Replace spin_unlock_wait() with lock/unlock pair

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07/03/2017 07:14 PM, Paul E. McKenney wrote:
On Mon, Jul 03, 2017 at 10:39:49AM -0400, Alan Stern wrote:
On Sat, 1 Jul 2017, Manfred Spraul wrote:

As we want to remove spin_unlock_wait() and replace it with explicit
spin_lock()/spin_unlock() calls, we can use this to simplify the
locking.

In addition:
- Reading nf_conntrack_locks_all needs ACQUIRE memory ordering.
- The new code avoids the backwards loop.

Only slightly tested, I did not manage to trigger calls to
nf_conntrack_all_lock().

Fixes: b16c29191dc8
Signed-off-by: Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Cc: Sasha Levin <sasha.levin@xxxxxxxxxx>
Cc: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
Cc: netfilter-devel@xxxxxxxxxxxxxxx
---
  net/netfilter/nf_conntrack_core.c | 44 +++++++++++++++++++++------------------
  1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index e847dba..1193565 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -96,19 +96,24 @@ static struct conntrack_gc_work conntrack_gc_work;
void nf_conntrack_lock(spinlock_t *lock) __acquires(lock)
  {
+	/* 1) Acquire the lock */
  	spin_lock(lock);
-	while (unlikely(nf_conntrack_locks_all)) {
-		spin_unlock(lock);
- /*
-		 * Order the 'nf_conntrack_locks_all' load vs. the
-		 * spin_unlock_wait() loads below, to ensure
-		 * that 'nf_conntrack_locks_all_lock' is indeed held:
-		 */
-		smp_rmb(); /* spin_lock(&nf_conntrack_locks_all_lock) */
-		spin_unlock_wait(&nf_conntrack_locks_all_lock);
-		spin_lock(lock);
-	}
+	/* 2) read nf_conntrack_locks_all, with ACQUIRE semantics */
+	if (likely(smp_load_acquire(&nf_conntrack_locks_all) == false))
+		return;
As far as I can tell, this read does not need to have ACQUIRE
semantics.

You need to guarantee that two things can never happen:

     (1) We read nf_conntrack_locks_all == false, and this routine's
	critical section for nf_conntrack_locks[i] runs after the
	(empty) critical section for that lock in
	nf_conntrack_all_lock().

     (2) We read nf_conntrack_locks_all == true, and this routine's
	critical section for nf_conntrack_locks_all_lock runs before
	the critical section in nf_conntrack_all_lock().
I was looking at nf_conntrack_all_unlock:
There is a smp_store_release() - which memory barrier does this pair with?

nf_conntrack_all_unlock()
    <arbitrary writes>
    smp_store_release(a, false)
    spin_unlock(b);

nf_conntrack_lock()
    spin_lock(c);
    xx=read_once(a)
    if (xx==false)
        return
    <arbitrary read>

In fact, neither one can happen even if smp_load_acquire() is replaced
with READ_ONCE().  The reason is simple enough, using this property of
spinlocks:

	If critical section CS1 runs before critical section CS2 (for
	the same lock) then: (a) every write coming before CS1's
	spin_unlock() will be visible to any read coming after CS2's
	spin_lock(), and (b) no write coming after CS2's spin_lock()
	will be visible to any read coming before CS1's spin_unlock().
Does this apply? The locks are different.
Thus for (1), assuming the critical sections run in the order mentioned
above, since nf_conntrack_all_lock() writes to nf_conntrack_locks_all
before releasing nf_conntrack_locks[i], and since nf_conntrack_lock()
acquires nf_conntrack_locks[i] before reading nf_conntrack_locks_all,
by (a) the read will always see the write.

Similarly for (2), since nf_conntrack_all_lock() acquires
nf_conntrack_locks_all_lock before writing to nf_conntrack_locks_all,
and since nf_conntrack_lock() reads nf_conntrack_locks_all before
releasing nf_conntrack_locks_all_lock, by (b) the read cannot see the
write.
And the Linux kernel memory model (https://lwn.net/Articles/718628/
and https://lwn.net/Articles/720550/) agrees with Alan.  Here is
a litmus test, which emulates spin_lock() with xchg_acquire() and
spin_unlock() with smp_store_release():

------------------------------------------------------------------------

C C-ManfredSpraul-L1G1xchgnr.litmus

(* Expected result: Never.  *)

{
}

P0(int *nfcla, spinlock_t *gbl, int *gbl_held, spinlock_t *lcl, int *lcl_held)
{
	/* Acquire local lock. */
	r10 = xchg_acquire(lcl, 1);
	r1 = READ_ONCE(*nfcla);
	if (r1) {
		smp_store_release(lcl, 0);
		r11 = xchg_acquire(gbl, 1);
		r12 = xchg_acquire(lcl, 1);
		smp_store_release(gbl, 0);
	}
	r2 = READ_ONCE(*gbl_held);
	WRITE_ONCE(*lcl_held, 1);
	WRITE_ONCE(*lcl_held, 0);
	smp_store_release(lcl, 0);
}

P1(int *nfcla, spinlock_t *gbl, int *gbl_held, spinlock_t *lcl, int *lcl_held)
{
	/* Acquire global lock. */
	r10 = xchg_acquire(gbl, 1);
	WRITE_ONCE(*nfcla, 1);
	r11 = xchg_acquire(lcl, 1);
	smp_store_release(lcl, 0);
	r2 = READ_ONCE(*lcl_held);
	WRITE_ONCE(*gbl_held, 1);
	WRITE_ONCE(*gbl_held, 0);
Where is the write that resets nfcla=0?
	smp_store_release(gbl, 0);
}

exists
((0:r2=1 \/ 1:r2=1) /\ 0:r10=0 /\ 0:r11=0 /\ 0:r12=0 /\ 1:r10=0 /\ 1:r11=0)

------------------------------------------------------------------------

The memory model says that the forbidden state does not happen:
[...]
Manfred, any objections to my changing your patch as Alan suggests?
I tried to pair the memory barriers:
nf_conntrack_all_unlock() contains a smp_store_release().
What does that pair with?

--
    Manfred



[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux