On 2025/3/6 13:06, Jiayuan Chen wrote:
On Wed, Mar 05, 2025 at 10:12:43AM +0800, Cong Wang wrote:
On Wed, Mar 05, 2025 at 10:02:34PM +0800, Dong Chenchen wrote:
WARNING: CPU: 0 PID: 6558 at net/core/sock_map.c:1703 sock_map_close+0x3c4/0x480
Modules linked in:
CPU: 0 UID: 0 PID: 6558 Comm: syz-executor.14 Not tainted 6.14.0-rc5+ #238
RIP: 0010:sock_map_close+0x3c4/0x480
Call Trace:
<TASK>
inet_release+0x144/0x280
__sock_release+0xb8/0x270
sock_close+0x1e/0x30
__fput+0x3c6/0xb30
__fput_sync+0x7b/0x90
__x64_sys_close+0x90/0x120
do_syscall_64+0x5d/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The root cause is:
sock_hash_update_common
sock_map_unref
sock_map_del_link
psock->psock_update_sk_prot(sk, psock, false);
//false won't restore proto
sk_psock_put
rcu_assign_sk_user_data(sk, NULL);
inet_release
sk->sk_prot->close
sock_map_close
WARN(sk->sk_prot->close == sock_map_close)
When psock is removed from sockmap, sock_map_del_link() still set
sk->sk_prot to bpf proto instead of restore it (for incorrect restore
value). sock release will triger warning of sock_map_close() for
recurse after psock drop.
But sk_psock_drop() restores it with sk_psock_restore_proto() after the
psock reference count goes to zero. So how could the above happen?
By the way, it would be perfect if you could add a test case for it
together with this patch (a followup patch is fine too).
Thanks!
I also have the same question as Cong, and I'll describe it in more detail
here:
'psock->saved_close' is always tcp_close (if your socket is TCP) and will
not change regardless of whether restore is executed or not. So when
entering the function sock_map_close() and encountering
WARN_ON_ONCE(saved_close == sock_map_close), 'saved_close' can only come
from 'saved_close = READ_ONCE(sk->sk_prot)->close'. This means we obtain
sock through psock = sk_psock(sk) and then enter the branch code after
judging it to be null.
'''
sock_map_close()
{
psock = sk_psock(sk);
if (likely(psock)) {
saved_close = psock->saved_close;
} else {
saved_close = READ_ONCE(sk->sk_prot)->close;
}
WARN_ON_ONCE(saved_close == sock_map_close);
}
'''
However, before psock becomes null, we have actually successfully executed
the restore:
'''
void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
{
write_lock_bh(&sk->sk_callback_lock);
sk_psock_restore_proto(sk, psock); // restore correctly
rcu_assign_sk_user_data(sk, NULL); // set psock null
...
}
'''
Passing false to psock_update_sk_prot may be problematic, but it shouldn't
cause the issue described in the commit message.
It may be necessary to provide more information on how sockmap is used to
determine the issue. :)
Thanks.
Thanks for your review!
sk_psock_restore_protofail to restore sk_prot for tcp_ulp set concurrently.
sock_hash_update_common
sock_map_unref
sock_map_del_link
setsockopt(TCP_ULP)
tcp_set_ulp
icsk->icsk_ulp_ops = ulp_ops; //set ulp
sk_psock_put
sk_psock_restore_proto
tcp_bpf_update_proto
if (inet_csk_has_ulp(sk))
tls_update
ctx->sk_proto = p; //not update sk->sk_prot for ulp set
inet_release
sk->sk_prot->close
sock_map_close
WARN(sk->sk_prot->close == sock_map_close)
Maybe we can fix it as:
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 99ca4465f702..791cc32dccfd 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -1025,15 +1025,22 @@ static int tls_init(struct sock *sk)
static void tls_update(struct sock *sk, struct proto *p,
void (*write_space)(struct sock *sk))
{
+ struct sk_psock *psock = sk_psock(sk);
struct tls_context *ctx;
+ bool restore = true;
WARN_ON_ONCE(sk->sk_prot == p);
+ if (psock)
+ restore = !!refcount_read(&psock->refcnt);
+
ctx = tls_get_ctx(sk);
if (likely(ctx)) {
ctx->sk_write_space = write_space;
ctx->sk_proto = p;
- } else {
+ }
+
+ if (ctx == NULL || !restore) {
/* Pairs with lockless read in sk_clone_lock(). */
WRITE_ONCE(sk->sk_prot, p);
sk->sk_write_space = write_space;