Re: [PATCH v1 net 2/2] rds: tcp: Fix use-after-free of net in reqsk_timer_handler().

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2024-02-26 at 11:14 -0800, Kuniyuki Iwashima wrote:
> From: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>
> Date: Fri, 23 Feb 2024 10:28:32 -0800
> > From: Eric Dumazet <edumazet@xxxxxxxxxx>
> > Date: Fri, 23 Feb 2024 19:09:27 +0100
> > > On Fri, Feb 23, 2024 at 6:26 PM Kuniyuki Iwashima
> > > <kuniyu@xxxxxxxxxx> wrote:
> > > > 
> > > > syzkaller reported a warning of netns tracker [0] followed by
> > > > KASAN
> > > > splat [1] and another ref tracker warning [1].
> > > > 
> > > > syzkaller could not find a repro, but in the log, the only
> > > > suspicious
> > > > sequence was as follows:
> > > > 
> > > >   18:26:22 executing program 1:
> > > >   r0 = socket$inet6_mptcp(0xa, 0x1, 0x106)
> > > >   ...
> > > >   connect$inet6(r0, &(0x7f0000000080)={0xa, 0x4001, 0x0,
> > > > @loopback}, 0x1c) (async)
> > > > 
> > > > The notable thing here is 0x4001 in connect(), which is
> > > > RDS_TCP_PORT.
> > > > 
> > > > So, the scenario would be:
> > > > 
> > > >   1. unshare(CLONE_NEWNET) creates a per netns tcp listener in
> > > >       rds_tcp_listen_init().
> > > >   2. syz-executor connect()s to it and creates a reqsk.
> > > >   3. syz-executor exit()s immediately.
> > > >   4. netns is dismantled.  [0]
> > > >   5. reqsk timer is fired, and UAF happens while freeing
> > > > reqsk.  [1]
> > > >   6. listener is freed after RCU grace period.  [2]
> > > > 
> > > > Basically, reqsk assumes that the listener guarantees netns
> > > > safety
> > > > until all reqsk timers are expired by holding the listener's
> > > > refcount.
> > > > However, this was not the case for kernel sockets.
> > > > 
> > > > Commit 740ea3c4a0b2 ("tcp: Clean up kernel listener's reqsk in
> > > > inet_twsk_purge()") fixed this issue only for per-netns ehash,
> > > > but
> > > > the issue still exists for the global ehash.
> > > > 
> > > > We can apply the same fix, but this issue is specific to RDS.
> > > > 
> > > > Instead of iterating potentially large ehash and purging reqsk
> > > > during
> > > > netns dismantle, let's hold netns refcount for the kernel TCP
> > > > listener.
> > > > 
> > > > 
> > > > Reported-by: syzkaller <syzkaller@xxxxxxxxxxxxxxxx>
> > > > Fixes: 467fa15356ac ("RDS-TCP: Support multiple RDS-TCP listen
> > > > endpoints, one per netns.")
> > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>
> > > > ---
> > > >  net/rds/tcp_listen.c | 5 +++++
> > > >  1 file changed, 5 insertions(+)
> > > > 
> > > > diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
> > > > index 05008ce5c421..4f7863932df7 100644
> > > > --- a/net/rds/tcp_listen.c
> > > > +++ b/net/rds/tcp_listen.c
> > > > @@ -282,6 +282,11 @@ struct socket *rds_tcp_listen_init(struct
> > > > net *net, bool isv6)
> > > >                 goto out;
> > > >         }
> > > > 
> > > > +       __netns_tracker_free(net, &sock->sk->ns_tracker,
> > > > false);
> > > > +       sock->sk->sk_net_refcnt = 1;
> > > > +       get_net_track(net, &sock->sk->ns_tracker, GFP_KERNEL);
> > > > +       sock_inuse_add(net, 1);
> > > > +
> > > 
> > > Why using sock_create_kern() then later 'convert' this kernel
> > > socket
> > > to a user one ?
> > > 
> > > Would using __sock_create() avoid this ?
> > 
> > I think yes, but LSM would see kern=0 in pre/post socket() hooks.
> > 
> > Probably we can use __sock_create() in net-next and see if someone
> > complains.
> 
> I noticed the patchwork status is Changes Requested.
> https://urldefense.com/v3/__https://patchwork.kernel.org/project/netdevbpf/list/?series=829213&state=*__;Kg!!ACWV5N9M2RV99hQ!KHKUQKUDnNCdiEcb4ZK1VBiYSitarEb-CAWeSJvaeK04fgW4cuWePg3Ac2HmIAPUHuqeCwgt466fHEKAAdfa$
>  
> 
> Should we use __sock_create() for RDS or add another parameter
> to __sock_create(..., kern=true/false, netref=true/false) and
> fix other similar uses (MPTCP, SMC, Netlink) altogether ?
> 
> Thanks!

Hi all,

Thank you for looking at this.  I've been doing a little investigation
in the area to better understand the issue and this fix.  While I
understand what this patch is trying to do here, I'd like to do a
little more digging as to why 740ea3c4a0b2 didnt work for rds, or what
else rds may not be doing correctly that the other sockets are.  I'm
not quite sure about setting the kern parameter to 0 for socket_create.
While it seems like it would have a similar effect, this looks
incorrect since this is not a user space socket.  

I'll do a little more diging myself too.  If you had another idea about
adding parameters to __sock_create, I'd be happy to take a look.  Thank
you!

Allison






[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux