Re: Security issue in NFS localio

Dave Chinner <david@xxxxxxxxxxxxx> · Fri, 5 Jul 2024 09:25:56 +1000

On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote:
> 
> 
> > On Jul 3, 2024, at 6:24 PM, NeilBrown <neilb@xxxxxxx> wrote:
> > 
> > 
> > I've been pondering security questions with localio - particularly
> > wondering what questions I need to ask.  I've found three focal points
> > which overlap but help me organise my thoughts:
> > 1- the LOCALIO RPC protocol
> > 2- the 'auth_domain' that nfsd uses to authorise access
> > 3- the credential that is used to access the file
> > 
> > 1/ It occurs to me that I could find out the UUID reported by a given
> > local server (just ask it over the RPC connection), find out the
> > filehandle for some file that I don't have write access to (not too
> > hard), and create a private NFS server (hacking nfs-ganesha?) which
> > reports the same uuid and reports that I have access to a file with
> > that filehandle.  If I then mount from that server inside a private
> > container on the same host that is running the local server, I would get
> > localio access to the target file.

This seems amazingly complex for something that is actually really
simple.  Keep in mind that I am speaking from having direct
experience with developing and maintaining NFS client IO bypass
infrastructure from when I worked at SGI as an NFS engineer.

So, let's look at the Irix NFS client/server and the "Bulk Data
Service" protocol extensions that SGI wrote for NFSv3 back in the
mid 1990s.  Here's an overview from the 1996 product documentation
"Getting Started with BDSpro":

https://irix7.com/techpubs/007-3274-001.pdf

At least read chapter 1 so you grok the fundamentals of how the IO
bypass worked. It should look familiar, because it isn't very
different to how NFS over RDMA or client side IO for pNFS works.

Essentially, the NFS client transparently sent all the data IO (read
and write) over a separate communications channel for any IO that
met the size and alignment constraints. This was effectively a
"remote-IO" bypass that streamed data rather than packetised it
(NFS_READ/NFS_WRITE is packetised data with RTT latency issues).
By getting rid of the round trip latency penalty, data could be
sent/received at full network throughput rates.
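
To put rough numbers on the round trip penalty, here is a minimal
sketch of the arithmetic. It is a toy model, not a description of how
BDS or any real NFS client behaves: it assumes only one request is
outstanding at a time, and the function and parameter names are made
up for illustration.

```c
#include <assert.h>

/*
 * Toy model, units are bytes and seconds. With one request in flight,
 * every block pays a full round trip before the next one can start.
 */
double packetised_rate(double block_bytes, double rtt_s,
                       double link_bytes_per_s)
{
    assert(block_bytes > 0 && rtt_s >= 0 && link_bytes_per_s > 0);

    double wire_time = block_bytes / link_bytes_per_s;
    return block_bytes / (rtt_s + wire_time);   /* bytes per second */
}

/* A stream pays the round trip once for the whole transfer. */
double streamed_rate(double total_bytes, double rtt_s,
                     double link_bytes_per_s)
{
    assert(total_bytes > 0 && rtt_s >= 0 && link_bytes_per_s > 0);

    return total_bytes / (rtt_s + total_bytes / link_bytes_per_s);
}
```

With a 32 KiB block, a 1 ms round trip and a 100 MB/s link (all
invented numbers), packetised_rate() comes out at about 25 MB/s, while
streamed_rate() for a 1 GB transfer stays close to 100 MB/s. Real
clients keep several requests in flight, so the gap is smaller in
practice, but the shape of the problem is the same.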

[ As an aside, the BDS side channel was also the mechanism used
by SGI for NFS over RDMA with custom full stack network offload
hardware back in the mid 1990s. NFS w/ BDS ran at about 800MB/s on
those networks on machines with 200MHz CPUs (think MIPS r10k). ]

The client side userspace has no idea this low level protocol
hijacking occurs, and it doesn't need to because all it changes
is the read/write IO speed. The NFS protocol is still used for all
authorisation, access checks, metadata operations, etc, and all that
changes is how NFS_READ and NFS_WRITE operations are performed.
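
The split described above can be sketched as a single decision in the
client. This is an illustration only: the thresholds, enum values and
the choose_path() name are hypothetical, and the actual BDS size and
alignment limits are not given in this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical thresholds, chosen only to make the example concrete. */
#define BYPASS_ALIGN   4096u            /* offset and length alignment */
#define BYPASS_MIN_LEN (64u * 1024u)    /* smallest IO worth streaming */

static_assert((BYPASS_ALIGN & (BYPASS_ALIGN - 1)) == 0,
              "alignment must be a power of two");

enum nfs_op  { OP_LOOKUP, OP_ACCESS, OP_GETATTR, OP_READ, OP_WRITE };
enum io_path { PATH_NFS_RPC, PATH_BYPASS };

enum io_path choose_path(enum nfs_op op, uint64_t offset, size_t len)
{
    /* Auth, access checks and metadata always stay on the NFS protocol. */
    if (op != OP_READ && op != OP_WRITE)
        return PATH_NFS_RPC;

    /* Data IO that misses the constraints falls back to plain NFS. */
    if (len < BYPASS_MIN_LEN)
        return PATH_NFS_RPC;
    if ((offset | (uint64_t)len) & (BYPASS_ALIGN - 1))
        return PATH_NFS_RPC;

    return PATH_BYPASS;
}
```

Nothing above the IO path needs to know which branch was taken, which
is why userspace never sees the difference.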

The local-io stuff is no different - we're just using a different
client side IO path in kernel. We don't need a new protocol, nor do
we need userspace to be involved *at all*.  The kernel NFS client
can easily discover that it is on the same host as the server. The
server already does this "client is on the same host" check, so both will
then know they can *transparently* enable the localio bypass without
involving userspace at all.

The NFS protocol still provides all the auth, creds, etc to allow
the NFS client read and write access to the file. The NFS server
provides the client with a filehandle built by the underlying
filesystem for the file the NFS client has been granted permission to
access.

The local filesystem will accept that filehandle from any kernel
side context via the export ops for that filesystem. This provides
a mechanism for the NFS client to convert that to a dentry
and so open the file directly from the file handle. This is what the
server already does, so it should be able to share the filehandle
decode and open code from the server, maybe even just reach into the
server export table directly....
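
As a rough picture of what "decode the filehandle via the export ops"
means, here is a userspace toy. It is not kernel code: in the kernel
this job belongs to the filesystem's struct export_operations (the
fh_to_dentry method), and every type and function name below is
invented for the example. The handle layout (fsid, inode number,
generation) is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handle: which export, which inode, which generation. */
struct toy_fh {
    uint32_t fsid;
    uint64_t ino;
    uint32_t gen;
};

struct toy_inode {
    uint64_t ino;
    uint32_t gen;       /* bumped when the inode number is reused */
    const char *name;
};

/* One exported filesystem and the inodes it can resolve. */
struct toy_export {
    uint32_t fsid;
    const struct toy_inode *inodes;
    size_t ninodes;
};

/*
 * Stand-in for the filesystem's filehandle decode. There is no path
 * walk and no mount namespace involved, only the export table and the
 * handle. Returns NULL for an unknown export or a stale handle.
 */
const struct toy_inode *toy_decode_fh(const struct toy_export *exports,
                                      size_t nexports,
                                      const struct toy_fh *fh)
{
    assert(exports != NULL && fh != NULL);

    for (size_t i = 0; i < nexports; i++) {
        if (exports[i].fsid != fh->fsid)
            continue;
        for (size_t j = 0; j < exports[i].ninodes; j++) {
            const struct toy_inode *ino = &exports[i].inodes[j];
            if (ino->ino == fh->ino)
                return ino->gen == fh->gen ? ino : NULL;
        }
        return NULL;    /* export found, inode not in it */
    }
    return NULL;        /* filesystem is not exported */
}
```

The lookup takes only the export table and the handle as input, so
whether the caller can see a mount of that filesystem never enters
into it.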

IOWs, we don't need to care about whether the mount is visible to
the NFS client - the filesystem *export* is visible to the *kernel*
and the export ops allow unfettered filehandle decoding. Containers
are irrelevant - the server has granted access to the file, and so
the NFS client has effective permissions to resolve the filehandle
directly.

Fundamentally, this is the same permission and access model that
pNFS is built on. Hence I don't understand why this local-io bypass
needs something completely new and seemingly very complex...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx