Re: [PATCH 10/10] svcrdma: Documentation update for the FastReg memory model

Tom Tucker <tom@xxxxxxxxxxxxxxxxxxxxx> · Wed, 01 Oct 2008 19:38:11 -0500

J. Bruce Fields wrote:
Thanks, I think this is much more helpful.

On Tue, Sep 30, 2008 at 03:17:21PM -0500, Tom Tucker wrote:
+Security
+--------
+
+  NFSRDMA exploits the RDMA capabilities of the IB and iWARP
+  transports to more efficiently exchange RPC data between the client
+  and the server. This section discusses the security implications of
+  the exchange of memory information on the wire when the wire may be
+  monitorable by an untrusted application. The identifier that
+  encapsulates this memory information is called an RKEY.
+
+  A principal exploit is that a node on the local network could snoop
+  RDMA packets containing RKEY and then forge a packet with this RKEY
+  to write and/or read the memory of the peer to which the RKEY
+  referred.
+
+  If the underlying RDMA device is capable of Fast Memory
+  Registration, then NFSRDMA is no less secure than TCP with
+  auth_unix. However, if the device does not support Fast Memory
+  Registration, then such a node could write anywhere in the server's
+  memory using the method above. At mount time, the server sends a

The server doesn't really know about mounts, especially not at this
level, so I assume you mean either server start time or client connect
time?

Right, client connect time, I'll fix. Thanks.

+  string to the message log to indicate whether or not Fast Memory
+  Registration is being used. If Fast Memory Registration is being
+  used, the string
+
+	"svcrdma: Using Fast Memory Registration"
+
+  is logged, otherwise,
+
+	"svcrdma: Using a Global DMA MR"
+
+  will be logged.

It'd be nicer to have something that can be queried by a program--a file
in proc or nfsd, for example--without having to grep through log files.
(Or is it possible the drivers already export enough information under
sysfs someplace to figure this out with a simple script?)

Yes, it's gross. But I was trying to keep it simple for the first go-round and
since it is conceivable that you have two adapters, one that supports FRMR and
the other doesn't, you would need a proc file per adapter. All my systems have
both iWARP and IB adapters in them. So half my connections are DMA MR and the
other FRMR.
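
Until something better than the log exists, the grepping can at least be scripted. A minimal sketch, assuming the two strings from the patch are logged verbatim and that the log text comes from something like dmesg; the function name and sample input are made up for illustration. Since the messages do not name the adapter, all it can do is count how often each model was reported:

```python
# Strings taken from the documentation patch above.
FASTREG_MSG = "svcrdma: Using Fast Memory Registration"
DMA_MR_MSG = "svcrdma: Using a Global DMA MR"


def count_memory_models(log_lines):
    """Count how many log lines report each memory model.

    Hypothetical helper: log_lines is any iterable of strings,
    for example the lines of saved dmesg output.
    """
    counts = {"fastreg": 0, "dma_mr": 0}
    for line in log_lines:
        if FASTREG_MSG in line:
            counts["fastreg"] += 1
        elif DMA_MR_MSG in line:
            counts["dma_mr"] += 1
    return counts
```

A non-zero "dma_mr" count only says that some connection fell back to the global DMA MR; it cannot say which adapter was involved, which is the limitation described above.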

Or maybe the non-fast registration stuff should be under a separate
configuration option entirely?  Distros could eventually enable only
the safer configurations and people doing testing could build their own
kernels with the rest enabled.

Perhaps, or maybe a module option that specifically disables DMA_MR. Also
note that with IB the DMA MR RKEY is not put on the wire, so I think I
need to qualify the warning somewhat.

My initial impulse is to be a bit scared of the non-fast-registration
case, but maybe I don't understand how this hardware is deployed.

In practice, I think the exposure is real, but somewhat academic.
Obviously as this sees wider adoption the likelihood that this could be
deployed on a network with untrusted hosts grows significantly. Today
I don't believe that's the case.

I would lean towards the module option and perhaps a Kconfig option that
allows you to tweak the default. I also think the policy should be transport
dependent. IOW, DMA MR is OK for IB, but verboten for iWARP.
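
To make the proposed policy concrete, here is a rough sketch of the decision as I picture it. This is a user-space model, not kernel code; the function name, the "ib"/"iwarp" strings, and the allow_dma_mr flag (standing in for the module option) are all hypothetical, and refusing the connection when nothing is permitted is an assumption on my part:

```python
def choose_memory_model(transport, supports_fastreg, allow_dma_mr=True):
    """Return "fastreg", "dma_mr", or None (no permitted model).

    transport:        "ib" or "iwarp" (illustrative labels only)
    supports_fastreg: whether the adapter offers Fast Memory Registration
    allow_dma_mr:     stand-in for a module option that can disable DMA MR
    """
    if supports_fastreg:
        # Fast Memory Registration is preferred whenever it is available.
        return "fastreg"
    if transport == "ib" and allow_dma_mr:
        # On IB the DMA MR RKEY is not put on the wire.
        return "dma_mr"
    # iWARP without FastReg, or DMA MR disabled by the administrator.
    return None
```

With this shape, a Kconfig option would only need to pick the default value of allow_dma_mr.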

Thanks for the feedback,
Tom

--b.

+
+  The sections below provide additional information on this issue.
+
+  The NFSRDMA protocol is defined such that a) only the server
+  initiates RDMA, and b) only the client's memory is exposed via
+  RKEY. This is why the server reads to fetch RPC data from the client
+  even though it would be more efficient for the client to write the
+  data to the server's memory. This design goal is not entirely
+  realized with iWARP, however, because the RKEY (called an STag on
+  iWARP) for the data sink of an RDMA_READ is actually placed on the
+  wire, and this RKEY has Remote Write permission. This means that the
+  server's memory is exposed by virtue of having placed the RKEY for
+  its local memory on the wire in order to receive the result of the
+  RDMA_READ.
+
+  By contrast, IB uses an opaque transaction ID to associate the
+  READ_RPL with the READ_REQ, and the data sink of a READ_REQ does not
+  require remote access. That said, the byzantine node in question
+  could forge a packet with this transaction ID and corrupt the target
+  memory. However, the scope of the exploit is bounded to the lifetime
+  of this single RDMA_READ request and to the memory mapped by the
+  data sink of the READ_REQ.
+
+  The newer RDMA adapters (both iWARP and IB) support "Fast Memory
+  Registration". This capability allows memory to be quickly
+  registered (i.e. made available for remote access) and de-registered
+  by submitting WR on the SQ. These capabilities provide a mechanism
+  to reduce the exposure discussed above by limiting the scope of the
+  exploit. The idea is to create an RKEY that only maps the single RPC
+  and whose effective lifetime is only the exchange of this single
+  RPC. This is the default memory model that is employed by the server
+  when supported by the adapter and by the client when the
+  rdma_memreg_strategy is set to 6. Note that the client and server
+  may use different memory registration strategies; however,
+  performance is better when both the client and server use the
+  FastReg memory registration strategy.
+
+  This approach has two benefits: a) it restricts the domain of the
+  exploit to the memory of a single RPC, and b) it limits the duration
+  of the exploit to the time it takes to satisfy the RDMA_READ.
+
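
The two benefits can be illustrated with a toy model of a one-shot registration. This is only a sketch under stated assumptions: the class and method names are invented, the key is just a random 32-bit number, and none of it reflects the svcrdma code or the verbs API. It shows that a key is honoured only inside the registered range and only until it is invalidated:

```python
import secrets


class OneShotRegistry:
    """Toy model of per-RPC memory registration (hypothetical names)."""

    def __init__(self):
        self._regions = {}  # rkey -> (base offset, length in bytes)

    def register(self, base, length):
        """Expose [base, base + length) and return a fresh key for it."""
        rkey = secrets.randbits(32)
        while rkey in self._regions:  # avoid reusing a live key
            rkey = secrets.randbits(32)
        self._regions[rkey] = (base, length)
        return rkey

    def invalidate(self, rkey):
        """End the key's lifetime once the RPC's data transfer is done."""
        self._regions.pop(rkey, None)

    def access_ok(self, rkey, offset, length):
        """True only for a live key and a range inside its region."""
        region = self._regions.get(rkey)
        if region is None:
            return False
        base, size = region
        return base <= offset and offset + length <= base + size
```

A forged access using a snooped key fails the range check if it targets memory outside the single RPC, and fails outright once invalidate() has run, which corresponds to benefits a) and b) above.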
+  It is arguable that a one-shot STag/RKEY is no less secure than RPC
+  on the TCP transport. Consider that the exact same byzantine
+  application could more easily corrupt TCP RPC payload by simply
+  forging a packet with the correct TCP sequence number -- in fact
+  it's easier than the RDMA exploit because the RDMA exploit requires
+  that you correctly forge both the TCP packet and the RDMA
+  payload. In addition the duration of the TCP exploit is the lifetime
+  of the connection, not the lifetime of a single WR/RPC data transfer.
+
+  RDMA on IB or iWARP using Fast Reg is no less secure than TCP.
+

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
