Re: [RFC PATCH 00/13] Ultra Ethernet driver introduction

On Fri, Mar 14, 2025 at 02:53:40PM +0000, Bernard Metzler wrote:

> I assume the correct way forward is to first clarify the
> structure of all user-visible objects that need to be
> created/controlled/destroyed, and to route them through
> this interface. Some will require extensions to given objects,
> some may be new, some will be as-is. rdma_netlink will probably
> be the right interface to look at for job control.

As I understand the job ID model, you will need some privileged
entity to create a "job ID file descriptor" that can be passed around
to unprivileged processes to grant them access to the job ID. This is
necessary since the job ID becomes part of the packet headers and we
must secure userspace to prevent hijacking or spoofing of these values
on the wire.

Netlink has a major downside: you can't use filesystem ACL
permissions to control access, so building a low-privilege daemon just
to do job ID management seems to me to be more difficult.

As an example, I would imagine having a job management char device
with a filesystem ACL that only allows something like SLURM's
privileged orchestrator to talk to it. SLURM wouldn't have something
like CAP_NET_ADMIN. SLURM would set up the job ID and pass the "job ID
FD" to the actual MPI workload processes to grant them permission to
use those network headers.

Nobody else in the system can create job IDs besides SLURM, and in a
multi-user environment one user cannot reach into another's job and
hijack its job ID because the FD does not leak outside the MPI process
tree.
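
The FD handoff described above can be sketched in plain userspace C. This
is only an illustration under stated assumptions: /dev/null stands in for
the hypothetical job management char device (no such device exists in
this RFC), and send_fd(), recv_fd() and demo_job_fd_handoff() are
made-up names. The parent plays the orchestrator that opens the device
and passes the FD over an AF_UNIX socket with SCM_RIGHTS; the child plays
the unprivileged workload that never opens the device itself.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send one file descriptor over a connected AF_UNIX socket. */
static int send_fd(int sock, int fd)
{
	char byte = 'J';	/* at least one data byte must accompany the FD */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		struct cmsghdr align;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor; returns the new FD or -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		struct cmsghdr align;
		char buf[CMSG_SPACE(sizeof(int))];
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Orchestrator (parent) hands a "job FD" to a workload (child). */
int demo_job_fd_handoff(void)
{
	int sv[2], job_fd, rc, status;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Workload: only ever sees the FD it was handed. */
		struct stat st;
		int fd;

		close(sv[0]);
		fd = recv_fd(sv[1]);
		_exit(fd >= 0 && fstat(fd, &st) == 0 &&
		      S_ISCHR(st.st_mode) ? 0 : 1);
	}
	close(sv[1]);
	/* Assumption: /dev/null stands in for the ACL-protected job device. */
	job_fd = open("/dev/null", O_RDWR);
	rc = job_fd >= 0 ? send_fd(sv[0], job_fd) : -1;
	if (job_fd >= 0)
		close(job_fd);
	close(sv[0]);
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return rc == 0 && WIFEXITED(status) &&
	       WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

The point of the sketch is that access follows the FD rather than a
capability: whoever passes the device's filesystem ACL can open it, and
everyone else gets access only if they are explicitly handed the FD.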

This RFC doesn't describe the intended security model, but I'm very
surprised to see ultraeth_nl_job_new_doit() not do any capability
checks, or any security whatsoever around access to the job.

It should be obvious that this would be a fairly trivial add-on to
rdma: a new char dev, some rdma netlink to report and inspect the
global job list, and a little driver helper to associate job FDs with
uverbs objects and retrieve the job ID. In fact we just did something
similar with the UCAP system.

Further, jobs are not a concept unique to UE. I am seeing other RDMA
scenarios talk about jobs now, perhaps inspired by UE. There is zero
reason to make jobs a UE-specific concept and not a general RDMA
concept.

Jason



