On Fri, 2018-01-05 at 13:35 -0700, Jason Gunthorpe wrote:
> On Fri, Jan 05, 2018 at 03:23:55PM -0500, Doug Ledford wrote:
> > On Fri, 2018-01-05 at 12:25 -0700, Jason Gunthorpe wrote:
> > > On Fri, Jan 05, 2018 at 01:06:58PM -0500, Doug Ledford wrote:
> > > > > Do the userspace daemons still manage the connection to SRP?
> > > > >
> > > > > If yes, then the networking information should be relative to the
> > > > > namespace of the thing that wrote to the sysfs file..
> > > >
> > > > Maybe, maybe not.  It depends on the implementation.  IIRC you get one
> > > > daemon per port, not one daemon per mount.
> > >
> > > I don't think it depends - if we expose this sysfs file to a container
> >
> > Who says we have to do that?  We could make the sysfs file only visible
> > in the init namespace and let the init namespace daemon control what
> > namespaces have what views.
>
> What 'views'? It is a sysfs file controlled by the kernel - srp_daemon
> has no control over it.

Ok, allow me to clarify: restrict the sysfs file used to create mappings
to the init_net namespace only, and by views I meant allowing the host
srp_daemon to create a mapping for a specific namespace, which would then
create a device file in that namespace, not a sysfs file.

> > views anyway.  We could just make that mandatory by refusing to create
> > devices from anything other than init_net namespace.  Then even if
> > someone does mount sysfs rw in a container, we're still good.
>
> Usually we don't put that kind of policy in the kernel.

No, we normally don't.  However....

> Someone could run a privileged container with full device access and
> expect this stuff to work right. In that case it is certainly correct
> for the srp_daemon and kernel to be in the namespace of the calling
> process.
>
> > > So from a security perspective containers shouldn't even have access
> > > to this thing at all without more work to ensure that the created
> > > block device is also restricted inside the container.
> >
> > This isn't sufficient.  The block device created must be constrained
> > within the container, but if we allow direct device access to the
> > underlying LUN on the target, then that target LUN must be exclusively
> > owned by the container.
>
> Yes. That is done on the storage controller via ACLs of that LUN.

But we broke that already...

> The
> container's net namespace would be restricted in some way that the ACL
> can uniquely identify it - and the srp_daemon could run inside the
> container.

When we were arguing over namespaces, especially as they relate to IPoIB
devices, we decided to allow the tuple to be p_key/qp/gid so that you
can have two separate containers on the same p_key and gid with the
differentiating factor being only the qp.  If we then use that to target
our SRP RDMACM connection, we've gone into an area where the target
can't differentiate our container.  So, yes, I think the storage target
should control the ACLs too, but I'm concerned that we've gone down a
path where that can't currently be done and changes will be required on
the target for things to work.

-- 
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
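
For illustration only, the p_key/qp/gid concern raised in the last paragraph of the message can be sketched as a toy model. This is not kernel, srp_daemon, or target code: the `Endpoint` type, the `target_view` and `ambiguous_groups` helpers, and all p_key, gid, and qp values are hypothetical, and the sketch assumes the target's LUN ACL can match on p_key and gid but not on the initiator's qp.

```python
from collections import defaultdict
from typing import NamedTuple


class Endpoint(NamedTuple):
    """Hypothetical model of the p_key/qp/gid tuple from the discussion."""
    p_key: int
    gid: str
    qp: int


def target_view(ep: Endpoint) -> tuple:
    # Assumption: the target-side ACL can key on p_key and gid only,
    # so the qp is dropped from what the target can match against.
    return (ep.p_key, ep.gid)


def ambiguous_groups(containers: dict) -> list:
    """Return groups of container names the target cannot tell apart."""
    seen = defaultdict(list)
    for name, ep in containers.items():
        seen[target_view(ep)].append(name)
    return [sorted(names) for names in seen.values() if len(names) > 1]


# Made-up values: a and b share p_key and gid and differ only in qp.
containers = {
    "container_a": Endpoint(p_key=0x8001, gid="fe80::1", qp=17),
    "container_b": Endpoint(p_key=0x8001, gid="fe80::1", qp=42),
    "container_c": Endpoint(p_key=0x8002, gid="fe80::1", qp=17),
}

print(ambiguous_groups(containers))
```

Under these assumptions all three containers are distinct on the host side, since their full tuples differ, but container_a and container_b collapse to the same identity from the target's point of view, so a per-LUN ACL could not grant access to one without the other.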