On Thu, Jun 13, 2024 at 3:41 PM Kuniyuki Iwashima <kuniyu@xxxxxxxxxx> wrote: > > From: Ignat Korchagin <ignat@xxxxxxxxxxxxxx> > Date: Wed, 12 Jun 2024 14:22:36 -0400 > > > And curious if bpf_get_socket_cookie() can be called any socket > > > family to trigger the splat. e.g. ieee802154_create() seems to > > > have the same bug. > > > > Just judging from the code - yes, indeed. > > > > > If so, how about clearing sock->sk in sk_common_release() ? > > > > This was my first thought, but I was a bit put off by the fact that > > sk_common_release() is called from many places and the sk object > > itself is reference counted. So not every call to sk_common_release() > > seems to actually free the sk object. > > sk_common_release() is called > > 1. when we fail to create a socket (socket() or accept() syscall) > 2. when we release the last refcount of the socket's file descriptor > (basically close() syscall) > > The issue only happens at 1. because we clear sock->sk at 2. in > __sock_release() after calling sock->ops->release(). > > So, we need not take care of these callers of sk_common_release(). > > - inet_release > - ->close() > - udp_lib_close > - ping_close > - raw_close > - rawv6_close > - l2tp_ip_close > - l2tp_ip6_close > - sctp_close > - ieee802154_sock_release > - ->close() > - raw_close > - dgram_close > - mctp_release > - ->close() > - mctp_sk_close > - pn_socket_release > - ->close() > - pn_sock_close > - pep_sock_close > > Then, the rest of the callers are: > > - __sock_create > - pf->create() > - inet_create > - inet6_create > - ieee802154_create > - smc_create > - __smc_create > > - setsockopt(TCP_ULP) > - smc_ulp_init > - __smc_create > > - sctp_accept > - sctp_v4_create_accept_sk > - sctp_v6_create_accept_sk > > we need not care about sctp_v[46]_create_accept_sk() because they don't set > sock->sk for the socket; we don't pass sock to sock_init_data(NULL, newsk) > before calling sk_common_release(). > > __sock_create() path and SMC's ULP path have the same issue, and > sk_common_release() releases the last refcount of struct sock there. > > So, I think we can set NULL to sock->sk in sk_common_release(). Thanks for the explanation. Makes sense. I'll spin up a v2 with this (and try to test it as well). > > > Secondly, I was put off by this > > comment (which I don't fully understand TBH) [1] > > > > On the other hand - in inet/inet6_create() we definitely know that the > > object would be freed, because we just created that. > > > > But if someone more familiar with the code confirms it is > > better/possible to do in sk_common_release(), I'm happy to adjust and > > it would be cleaner indeed. > > > > > ---8<--- > > > diff --git a/net/core/sock.c b/net/core/sock.c > > > index 8629f9aecf91..bbc94954d9bf 100644 > > > --- a/net/core/sock.c > > > +++ b/net/core/sock.c > > > @@ -3754,6 +3754,9 @@ void sk_common_release(struct sock *sk) > > > * until the last reference will be released. > > > */ > > > > > > + if (sk->sk_socket) > > > + sk->sk_socket->sk = NULL; > > > + > > > sock_orphan(sk); > > > > > > xfrm_sk_free_policy(sk); > > > ---8<--- > > > > [1]: https://elixir.bootlin.com/linux/v6.10-rc3/source/include/net/sock.h#L1985 >