Patch "Revert "tcp: avoid the lookup process failing to get sk in ehash table"" has been added to the 6.4-stable tree

This is a note to let you know that I've just added the patch titled

    Revert "tcp: avoid the lookup process failing to get sk in ehash table"

to the 6.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     revert-tcp-avoid-the-lookup-process-failing-to-get-s.patch
and it can be found in the queue-6.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit ecd467dd886c50804703a2c430a0a51d19acb739
Author: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>
Date:   Mon Jul 17 14:59:18 2023 -0700

    Revert "tcp: avoid the lookup process failing to get sk in ehash table"
    
    [ Upstream commit 81b3ade5d2b98ad6e0a473b0e1e420a801275592 ]
    
    This reverts commit 3f4ca5fafc08881d7a57daa20449d171f2887043.
    
    Commit 3f4ca5fafc08 ("tcp: avoid the lookup process failing to get sk in
    ehash table") reversed the order in which a socket is inserted into ehash
    to fix an issue where an ehash lookup could fail while a reqsk/full sk/twsk
    was being swapped.  However, it introduced another lookup failure.
    
    The full socket in ehash is allocated from a slab with SLAB_TYPESAFE_BY_RCU
    and does not have SOCK_RCU_FREE, so the socket could be reused even while
    it is being referenced on another CPU doing RCU lookup.
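
    For context, here is a minimal sketch of how an RCU ehash lookup copes with
    such reuse, loosely modeled on __inet_lookup_established() (variable names
    and the inet_match() arguments are abbreviated for illustration): the reader
    must pin the socket with refcount_inc_not_zero() and then re-check the keys,
    because the slab may have recycled the object in the meantime.

        begin:
        sk_nulls_for_each_rcu(sk, node, &head->chain) {
                if (sk->sk_hash != hash)
                        continue;
                if (likely(inet_match(net, sk, acookie, ports, dif, sdif))) {
                        /* The object may be freed/reused at any time, so pin it... */
                        if (unlikely(!refcount_inc_not_zero(&sk->sk_refcnt)))
                                goto out;
                        /* ...and re-validate the keys after taking the reference. */
                        if (unlikely(!inet_match(net, sk, acookie, ports, dif, sdif))) {
                                sock_gen_put(sk);
                                goto begin;
                        }
                        goto found;
                }
        }
        /* A nulls mismatch only catches walking off into a *different* chain. */
        if (get_nulls_value(node) != slot)
                goto begin;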
    
    Let's say a socket is reused and inserted into the same hash bucket while
    a lookup is traversing it.  After the blamed commit, the new socket is
    inserted at the end of the list, so the lookup continues from the reused
    socket's new position at the tail and skips every socket placed after its
    previous position, resulting in an ehash lookup failure.
    
    As described in Documentation/RCU/rculist_nulls.rst, we should insert a
    new socket at the head of the list to avoid such an issue.
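
    Illustrative sketch only (obj and chain are hypothetical names), following
    the pattern described in Documentation/RCU/rculist_nulls.rst: reinsertion
    of a reused object must go to the head of its chain.

        /* Reader-safe: a lookup parked on the reused object follows
         * obj->next back through the old head element and re-walks the
         * whole chain, so no entry can be skipped.
         */
        hlist_nulls_add_head_rcu(&obj->node, chain);

        /* Unsafe here: a lookup parked on the reused object steps straight
         * onto the nulls marker of the *same* chain, get_nulls_value()
         * matches the expected slot, no restart is triggered, and every
         * entry between the object's old position and the tail is skipped.
         *
         *      hlist_nulls_add_tail_rcu(&obj->node, chain);
         */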
    
    This issue, the swap-lookup-failure, and another variant reported in [0]
    can all be handled properly by adding a locked ehash lookup suggested by
    Eric Dumazet [1].
    
    However, this issue could occur for every packet and is thus more likely
    than the other two races, so let's revert the change for now.
    
    Link: https://lore.kernel.org/netdev/20230606064306.9192-1-duanmuquan@xxxxxxxxx/ [0]
    Link: https://lore.kernel.org/netdev/CANn89iK8snOz8TYOhhwfimC7ykYA78GA3Nyv8x06SZYa1nKdyA@xxxxxxxxxxxxxx/ [1]
    Fixes: 3f4ca5fafc08 ("tcp: avoid the lookup process failing to get sk in ehash table")
    Signed-off-by: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>
    Link: https://lore.kernel.org/r/20230717215918.15723-1-kuniyu@xxxxxxxxxx
    Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index e7391bf310a75..0819d6001b9ab 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -650,20 +650,8 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk)
 	spin_lock(lock);
 	if (osk) {
 		WARN_ON_ONCE(sk->sk_hash != osk->sk_hash);
-		ret = sk_hashed(osk);
-		if (ret) {
-			/* Before deleting the node, we insert a new one to make
-			 * sure that the look-up-sk process would not miss either
-			 * of them and that at least one node would exist in ehash
-			 * table all the time. Otherwise there's a tiny chance
-			 * that lookup process could find nothing in ehash table.
-			 */
-			__sk_nulls_add_node_tail_rcu(sk, list);
-			sk_nulls_del_node_init_rcu(osk);
-		}
-		goto unlock;
-	}
-	if (found_dup_sk) {
+		ret = sk_nulls_del_node_init_rcu(osk);
+	} else if (found_dup_sk) {
 		*found_dup_sk = inet_ehash_lookup_by_sk(sk, list);
 		if (*found_dup_sk)
 			ret = false;
@@ -672,7 +660,6 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk)
 	if (ret)
 		__sk_nulls_add_node_rcu(sk, list);
 
-unlock:
 	spin_unlock(lock);
 
 	return ret;
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index 40052414c7c71..2c1b245dba8e8 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -88,10 +88,10 @@ void inet_twsk_put(struct inet_timewait_sock *tw)
 }
 EXPORT_SYMBOL_GPL(inet_twsk_put);
 
-static void inet_twsk_add_node_tail_rcu(struct inet_timewait_sock *tw,
-					struct hlist_nulls_head *list)
+static void inet_twsk_add_node_rcu(struct inet_timewait_sock *tw,
+				   struct hlist_nulls_head *list)
 {
-	hlist_nulls_add_tail_rcu(&tw->tw_node, list);
+	hlist_nulls_add_head_rcu(&tw->tw_node, list);
 }
 
 static void inet_twsk_add_bind_node(struct inet_timewait_sock *tw,
@@ -144,7 +144,7 @@ void inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
 
 	spin_lock(lock);
 
-	inet_twsk_add_node_tail_rcu(tw, &ehead->chain);
+	inet_twsk_add_node_rcu(tw, &ehead->chain);
 
 	/* Step 3: Remove SK from hash chain */
 	if (__sk_nulls_del_node_init_rcu(sk))


