From: Trond Myklebust <trondmy@xxxxxxxxxxxxxxx>
Date: Sat, 26 Oct 2024 00:35:30 +0000
> On Fri, 2024-10-25 at 14:20 -0700, Kuniyuki Iwashima wrote:
> > From: "liujian (CE)" <liujian56@xxxxxxxxxx>
> > Date: Fri, 25 Oct 2024 11:32:52 +0800
> > > > > > If not, then what prevents it from happening?
> > > > > The socket created by the userspace program obtains the
> > > > > reference counting of the namespace, but the kernel socket
> > > > > does not.
> > > > >
> > > > > There's some discussion here:
> > > > > https://lore.kernel.org/all/CANn89iJE5anTbyLJ0TdGAqGsE+GichY3YzQECjNUVMz=G3bcQg@xxxxxxxxxxxxxx/
> > > > OK... So then it looks to me as if NFS, SMB, AFS, and any other
> > > > networked filesystem that can be started from inside a container
> > > > is going to need to do the same thing that rds appears to be
> > > > doing.
> >
> > FWIW, recently we saw a similar UAF on CIFS.
> >
> > > >
> > > > Should there perhaps be a helper function in the networking layer
> > > > for this?
> > >
> > > There should be no such helper function at present, right?.
> > >
> > > If get net's reference to fix this problem, the following test is
> > > performed. There's nothing wrong with this case. I don't know if
> > > there's anything else to consider.
> > >
> > > I don't have any other ideas other than these two methods. Do you
> > > have any suggestions on this problem? @Eric @Jakub ... @All
> >
> > The netns lifetime should be managed by the upper layer rather than
> > the networking layer. If the netns is already dead, the upper layer
> > must discard the net pointer anyway.
> >
> > I suggest checking maybe_get_net() in NFS, CIFS, etc and then calling
> > __sock_create() with kern 0.
> >
>
> Thanks for the suggestion, but we already manage the netns lifetime in
> the RPC layer. A reference is taken when the filesystem is being
> mounted. It is dropped when the filesystem is being unmounted.
>
> The problem is the TCP timer races on shutdown. There is no interest in
> having to manage that in the RPC layer.

Does that mean netns is always alive when the socket is created
in svc_create_socket() or xs_create_sock() ?

If so, you can just use __sock_create(kern=0) there to prevent
net from being freed before the socket.

sock_create_kern() and kern@ are confusing, and we had similar
issues in other kernel TCP socket users SMC/RDS, so I'll rename
them to sock_create_noref() and no_net_ref@ or something.
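
To illustrate why the kern argument matters for the netns lifetime, here
is a minimal userspace sketch. It is a toy model, not kernel code: every
toy_* name is hypothetical and only mimics the idea that a kern=0 socket
takes a reference on its namespace while a kern=1 socket does not. The
"mount reference" stands in for the reference the RPC layer holds between
mount and unmount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct net: only the reference count matters here. */
struct toy_net {
	int refcnt;	/* number of holders keeping the namespace alive */
	bool freed;	/* set once the last reference is dropped */
};

/* Toy stand-in for a socket bound to a namespace. */
struct toy_sock {
	struct toy_net *net;
	bool holds_net_ref;	/* true for kern=0 style, false for kern=1 */
};

static void toy_get_net(struct toy_net *net)
{
	net->refcnt++;
}

static void toy_put_net(struct toy_net *net)
{
	if (--net->refcnt == 0)
		net->freed = true;
}

/* Assumption of this model: kern=0 takes a netns reference, kern=1 does not. */
static void toy_sock_create(struct toy_sock *sk, struct toy_net *net, int kern)
{
	sk->net = net;
	sk->holds_net_ref = !kern;
	if (sk->holds_net_ref)
		toy_get_net(net);
}

static void toy_sock_release(struct toy_sock *sk)
{
	if (sk->holds_net_ref)
		toy_put_net(sk->net);
	sk->net = NULL;
}

/*
 * Returns true if the namespace is still alive after the mount reference
 * is dropped while the socket has not been released yet, i.e. the window
 * in which a late TCP timer could still touch the namespace.
 */
static bool toy_net_survives_unmount(int kern)
{
	struct toy_net net = { .refcnt = 1, .freed = false };	/* mount ref */
	struct toy_sock sk;
	bool alive;

	toy_sock_create(&sk, &net, kern);
	toy_put_net(&net);	/* unmount drops the mount's reference */
	alive = !net.freed;	/* what a late timer would observe */
	toy_sock_release(&sk);

	return alive;
}
```

In this model, toy_net_survives_unmount(0) reports the namespace as still
alive because the socket's own reference outlives the mount's, while
toy_net_survives_unmount(1) reports it as already freed.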