On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote:
>
>
> > On Jul 3, 2024, at 6:24 PM, NeilBrown <neilb@xxxxxxx> wrote:
> >
> >
> > I've been pondering security questions with localio - particularly
> > wondering what questions I need to ask. I've found three focal points
> > which overlap but help me organise my thoughts:
> > 1- the LOCALIO RPC protocol
> > 2- the 'auth_domain' that nfsd uses to authorise access
> > 3- the credential that is used to access the file
> >
> > 1/ It occurs to me that I could find out the UUID reported by a given
> > local server (just ask it over the RPC connection), find out the
> > filehandle for some file that I don't have write access to (not too
> > hard), and create a private NFS server (hacking nfs-ganasha?) which
> > reports the same uuid and reports that I have access to a file with
> > that filehandle. If I then mount from that server inside a private
> > container on the same host that is running the local server, I would get
> > localio access to the target file.

This seems amazingly complex for something that is actually really
simple. Keep in mind that I am speaking from having direct
experience with developing and maintaining NFS client IO bypass
infrastructure from when I worked at SGI as an NFS engineer.

So, let's look at the Irix NFS client/server and the "Bulk Data
Service" protocol extensions that SGI wrote for NFSv3 back in the
mid 1990s. Here's an overview from the 1996 product documentation
"Getting Started with BDSpro":

https://irix7.com/techpubs/007-3274-001.pdf

At least read chapter 1 so you grok the fundamentals of how the IO
bypass worked. It should look familiar, because it isn't very
different to how NFS over RDMA or client side IO for pNFS works.

Essentially, the NFS client transparently sent all the data IO (read
and write) over a separate communications channel for any IO that met
the size and alignment constraints.
This was effectively a "remote-IO" bypass that streamed data rather
than packetising it (NFS_READ/NFS_WRITE is packetised data with RTT
latency issues). By getting rid of the round trip latency penalty,
data could be sent/received at full network throughput rates.

[ As an aside, the BDS side channel was also the mechanism used by
SGI for NFS over RDMA with custom full stack network offload hardware
back in the mid 1990s. NFS w/ BDS ran at about 800MB/s on those
networks on machines with 200MHz CPUs (think MIPS r10k). ]

The client side userspace has no idea this low level protocol
hijacking occurs, and it doesn't need to because all it changes is
the read/write IO speed. The NFS protocol is still used for all
authorisation, access checks, metadata operations, etc, and all that
changes is how NFS_READ and NFS_WRITE operations are performed.

The local-io stuff is no different - we're just using a different
client side IO path in kernel. We don't need a new protocol, nor do
we need userspace to be involved *at all*.

The kernel NFS client can easily discover that it is on the same
host as the server. The server already does this "client is on the
same host" check, so both will then know they can *transparently*
enable the localio bypass without involving userspace at all.

The NFS protocol still provides all the auth, creds, etc to allow
the NFS client read and write access to the file. The NFS server
provides the client with a filehandle built by the underlying
filesystem for the file the NFS client has been given permission to
access.

The local filesystem will accept that filehandle from any kernel
side context via the export ops for that filesystem. This provides a
mechanism for the NFS client to convert that filehandle to a dentry
and so open the file directly from the file handle. This is what the
server already does, so the client should be able to share the
filehandle decode and open code from the server, maybe even just
reach into the server export table directly...
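
A toy model of the flow described above may help make the argument
concrete. This is a sketch in Python, not kernel code: every class
and method name in it (ToyFilesystem, ToyServer, ToyClient,
fh_to_inode, ...) is hypothetical, the access check is reduced to an
owner uid comparison, and the filehandle layout (inode number plus
generation) is an assumption chosen for illustration. It only shows
the shape of the model: authorisation always goes through the server,
and the same-host client then decodes the filehandle itself.

```python
"""Toy model of the localio access flow. Not kernel code; all names are hypothetical."""
import errno
import struct
from dataclasses import dataclass


@dataclass
class ToyInode:
    ino: int
    gen: int      # generation number, distinguishes reuse of an inode number
    owner: int    # the only uid allowed access in this simplified model
    data: bytes


class ToyFilesystem:
    """Stands in for the exported filesystem and its export ops."""

    def __init__(self):
        self.by_path = {}
        self.by_ino = {}

    def create(self, path, owner, data):
        inode = ToyInode(ino=len(self.by_ino) + 1, gen=1, owner=owner, data=data)
        self.by_path[path] = inode
        self.by_ino[inode.ino] = inode
        return inode

    def encode_fh(self, inode):
        # The filesystem builds the handle; the client treats it as opaque.
        return struct.pack("<QI", inode.ino, inode.gen)

    def fh_to_inode(self, fh):
        # Stateless decode: any caller holding a valid handle can resolve it.
        ino, gen = struct.unpack("<QI", fh)
        inode = self.by_ino.get(ino)
        if inode is None or inode.gen != gen:
            raise OSError(errno.ESTALE, "stale file handle")
        return inode


class ToyServer:
    """Stands in for the NFS server: does the access check, issues handles."""

    def __init__(self, fs):
        self.fs = fs

    def lookup(self, uid, path):
        inode = self.fs.by_path[path]
        if inode.owner != uid:
            raise PermissionError(path)
        return self.fs.encode_fh(inode)

    def read(self, fh, offset, length):
        # The ordinary protocol data path (the NFS_READ equivalent).
        return self.fs.fh_to_inode(fh).data[offset:offset + length]


class ToyClient:
    """Stands in for the NFS client; local_fs is set only on the same host."""

    def __init__(self, server, uid, local_fs=None):
        self.server = server
        self.uid = uid
        self.local_fs = local_fs

    def read(self, path, offset, length):
        # Authorisation always goes through the server, bypass or not.
        fh = self.server.lookup(self.uid, path)
        if self.local_fs is not None:
            # Same host: decode the handle directly, skipping the data protocol.
            return self.local_fs.fh_to_inode(fh).data[offset:offset + length]
        return self.server.read(fh, offset, length)
```

In this model a client constructed with local_fs set and one without
it return the same bytes for the same read; only the data path
differs. A uid the server refuses never obtains a filehandle, so it
never reaches the direct decode step.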

IOWs, we don't need to care about whether the mount is visible to
the NFS client - the filesystem *export* is visible to the *kernel*
and the export ops allow unfettered filehandle decoding. Containers
are irrelevant - the server has granted access to the file, and so
the NFS client has effective permissions to resolve the filehandle
directly.

Fundamentally, this is the same permission and access model that
pNFS is built on. Hence I don't understand why this local-io bypass
needs something completely new and seemingly very complex...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx