Re: [PATCH v1 net 2/2] rds: tcp: Fix use-after-free of net in reqsk_timer_handler().

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Eric Dumazet <edumazet@xxxxxxxxxx>
Date: Mon, 26 Feb 2024 21:00:07 +0100
> On Mon, Feb 26, 2024 at 8:48 PM Kuniyuki Iwashima <kuniyu@xxxxxxxxxx> wrote:
> >
> > From: Eric Dumazet <edumazet@xxxxxxxxxx>
> > Date: Mon, 26 Feb 2024 20:40:10 +0100
> > > On Mon, Feb 26, 2024 at 8:22 PM Allison Henderson
> > > <allison.henderson@xxxxxxxxxx> wrote:
> > > >
> > > > On Mon, 2024-02-26 at 11:14 -0800, Kuniyuki Iwashima wrote:
> > > > > From: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>
> > > > > Date: Fri, 23 Feb 2024 10:28:32 -0800
> > > > > > From: Eric Dumazet <edumazet@xxxxxxxxxx>
> > > > > > Date: Fri, 23 Feb 2024 19:09:27 +0100
> > > > > > > On Fri, Feb 23, 2024 at 6:26 PM Kuniyuki Iwashima
> > > > > > > <kuniyu@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > syzkaller reported a warning of netns tracker [0] followed by
> > > > > > > > KASAN
> > > > > > > > splat [1] and another ref tracker warning [1].
> > > > > > > >
> > > > > > > > syzkaller could not find a repro, but in the log, the only
> > > > > > > > suspicious
> > > > > > > > sequence was as follows:
> > > > > > > >
> > > > > > > >   18:26:22 executing program 1:
> > > > > > > >   r0 = socket$inet6_mptcp(0xa, 0x1, 0x106)
> > > > > > > >   ...
> > > > > > > >   connect$inet6(r0, &(0x7f0000000080)={0xa, 0x4001, 0x0,
> > > > > > > > @loopback}, 0x1c) (async)
> > > > > > > >
> > > > > > > > The notable thing here is 0x4001 in connect(), which is
> > > > > > > > RDS_TCP_PORT.
> > > > > > > >
> > > > > > > > So, the scenario would be:
> > > > > > > >
> > > > > > > >   1. unshare(CLONE_NEWNET) creates a per netns tcp listener in
> > > > > > > >       rds_tcp_listen_init().
> > > > > > > >   2. syz-executor connect()s to it and creates a reqsk.
> > > > > > > >   3. syz-executor exit()s immediately.
> > > > > > > >   4. netns is dismantled.  [0]
> > > > > > > >   5. reqsk timer is fired, and UAF happens while freeing
> > > > > > > > reqsk.  [1]
> > > > > > > >   6. listener is freed after RCU grace period.  [2]
> > > > > > > >
> > > > > > > > Basically, reqsk assumes that the listener guarantees netns
> > > > > > > > safety
> > > > > > > > until all reqsk timers are expired by holding the listener's
> > > > > > > > refcount.
> > > > > > > > However, this was not the case for kernel sockets.
> > > > > > > >
> > > > > > > > Commit 740ea3c4a0b2 ("tcp: Clean up kernel listener's reqsk in
> > > > > > > > inet_twsk_purge()") fixed this issue only for per-netns ehash,
> > > > > > > > but
> > > > > > > > the issue still exists for the global ehash.
> > > > > > > >
> > > > > > > > We can apply the same fix, but this issue is specific to RDS.
> > > > > > > >
> > > > > > > > Instead of iterating potentially large ehash and purging reqsk
> > > > > > > > during
> > > > > > > > netns dismantle, let's hold netns refcount for the kernel TCP
> > > > > > > > listener.
> > > > > > > >
> > > > > > > >
> > > > > > > > Reported-by: syzkaller <syzkaller@xxxxxxxxxxxxxxxx>
> > > > > > > > Fixes: 467fa15356ac ("RDS-TCP: Support multiple RDS-TCP listen
> > > > > > > > endpoints, one per netns.")
> > > > > > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>
> > > > > > > > ---
> > > > > > > >  net/rds/tcp_listen.c | 5 +++++
> > > > > > > >  1 file changed, 5 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
> > > > > > > > index 05008ce5c421..4f7863932df7 100644
> > > > > > > > --- a/net/rds/tcp_listen.c
> > > > > > > > +++ b/net/rds/tcp_listen.c
> > > > > > > > @@ -282,6 +282,11 @@ struct socket *rds_tcp_listen_init(struct
> > > > > > > > net *net, bool isv6)
> > > > > > > >                 goto out;
> > > > > > > >         }
> > > > > > > >
> > > > > > > > +       __netns_tracker_free(net, &sock->sk->ns_tracker,
> > > > > > > > false);
> > > > > > > > +       sock->sk->sk_net_refcnt = 1;
> > > > > > > > +       get_net_track(net, &sock->sk->ns_tracker, GFP_KERNEL);
> > > > > > > > +       sock_inuse_add(net, 1);
> > > > > > > > +
> > > > > > >
> > > > > > > Why using sock_create_kern() then later 'convert' this kernel
> > > > > > > socket
> > > > > > > to a user one ?
> > > > > > >
> > > > > > > Would using __sock_create() avoid this ?
> > > > > >
> > > > > > I think yes, but LSM would see kern=0 in pre/post socket() hooks.
> > > > > >
> > > > > > Probably we can use __sock_create() in net-next and see if someone
> > > > > > complains.
> > > > >
> > > > > I noticed the patchwork status is Changes Requested.
> > > > > https://urldefense.com/v3/__https://patchwork.kernel.org/project/netdevbpf/list/?series=829213&state=*__;Kg!!ACWV5N9M2RV99hQ!KHKUQKUDnNCdiEcb4ZK1VBiYSitarEb-CAWeSJvaeK04fgW4cuWePg3Ac2HmIAPUHuqeCwgt466fHEKAAdfa$
> > > > >
> > > > >
> > > > > Should we use __sock_create() for RDS or add another parameter
> > > > > to __sock_create(..., kern=true/false, netref=true/false) and
> > > > > fix other similar uses (MPTCP, SMC, Netlink) altogether ?
> > > > >
> > > > > Thanks!
> > > >
> > > > Hi all,
> > > >
> > > > Thank you for looking at this.  I've been doing a little investigation
> > > > in the area to better understand the issue and this fix.  While I
> > > > understand what this patch is trying to do here, I'd like to do a
> > > > little more digging as to why 740ea3c4a0b2 didnt work for rds, or what
> > > > else rds may not be doing correctly that the other sockets are.  I'm
> > > > not quite sure about setting the kern parameter to 0 for socket_create.
> > > > While it seems like it would have a similar effect, this looks
> > > > incorrect since this is not a user space socket.
> > > >
> > > > I'll do a little more diging myself too.  If you had another idea about
> > > > adding parameters to __sock_create, I'd be happy to take a look.  Thank
> > > > you!
> > >
> > > I wonder if the following change would help ?
> >
> > Yes, it also fixes the issue. :)
> > https://lore.kernel.org/netdev/20240226193857.69672-1-kuniyu@xxxxxxxxxx/
> >
> > but it will trigger full ehash iteration for netns with no RDS usage
> > (and even without TCP).
> >
> > So, I think __sock_create() or the netref conversion would be better.
> 
> Maybe, although you could add debugging/assertions to make sure no
> TCP_NEW_SYN_RECV request are created on behalf of a kernel socket...
> 
> I am pretty sure other layers in the kernel use kernel socket TCP
> listeners, SUNRPC for instance.

That sounds nice!

I'll include this diff in v2.

---8<--
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index df7b13f0e5e0..d949501d13dc 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6972,6 +6972,8 @@ struct request_sock *inet_reqsk_alloc(const struct request_sock_ops *ops,
 	if (req) {
 		struct inet_request_sock *ireq = inet_rsk(req);
 
+		DEBUG_NET_WARN_ON_ONCE(!sk_listener->sk_net_refcnt);
+
 		ireq->ireq_opt = NULL;
 #if IS_ENABLED(CONFIG_IPV6)
 		ireq->pktopts = NULL;
---8<---




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux