Re: Using cgroup membership for resource access control?

Tejun Heo <tj@xxxxxxxxxx> · Mon, 6 Feb 2023 13:30:47 -1000

Hello,

On Mon, Feb 06, 2023 at 10:18:11PM +0000, Luck, Tony wrote:
> Imagine some AI training application with one process running per core on
> a server with a hundred or so cores. Each of these processes wants periodically
> to share work so far on a subset of the problem with one or more other processes.
> The "virtual windows" allow an accelerator device to copy data between a region
> in the source process (the owner of the virtual window) and another process that
> needs to access/supply updates.
> 
> Process tree is easy if the test is just "do these two tasks have the same getppid()?"
> Seems harder if the process tree is more complex and I want "Are these two processes
> both descended from a particular common ancestor?"
> 
> Using fd passing would involve an O(N^2) step where each process talks to each
> other process in turn to complete a link in the mesh of connections. This would need
> to be repeated if additional processes are started.

Wouldn't it be more usual for the parent to create the fd and let all the
children share through it? Even if not necessarily the parent, there can
always be a main process that can send the fd to whoever needs it.
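
As a rough sketch of that pattern (Python for brevity, assuming Linux and
Python 3.9+ for socket.send_fds() / socket.recv_fds(); the broadcast_fd()
name and the temp file standing in for the real resource are made up for
illustration), the main process does one SCM_RIGHTS send per peer, so the
cost is O(N) rather than an O(N^2) mesh:

```python
import os
import socket
import tempfile


def broadcast_fd(n_peers, payload):
    """Hypothetical helper: a main process hands one fd to n_peers children.

    The children are forked *before* the fd exists, so they can only get it
    through SCM_RIGHTS fd passing, not through fork inheritance.
    Returns what each child read through the fd it received.
    """
    peers = []
    for _ in range(n_peers):
        main_end, child_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        pid = os.fork()
        if pid == 0:
            # Child: wait for the fd, read through it, report back, exit.
            status = 1
            try:
                main_end.close()
                _, fds, _, _ = socket.recv_fds(child_end, 1, 1)
                data = os.pread(fds[0], len(payload), 0)
                child_end.sendall(data)
                status = 0
            finally:
                os._exit(status)
        child_end.close()
        peers.append((pid, main_end))

    results = []
    # Stand-in for the real shared resource; created after the forks.
    with tempfile.TemporaryFile() as shared:
        shared.write(payload)
        shared.flush()
        for pid, main_end in peers:
            # One send per peer: N sends in total.
            socket.send_fds(main_end, [b"x"], [shared.fileno()])
            results.append(main_end.recv(len(payload)))
            main_end.close()
            os.waitpid(pid, 0)
    return results
```

A process started later only needs one more send from the main process,
not a new round of pairwise exchanges.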

> It would be much nicer to have an operation that matches what the applications
> want to do, namely "I want to broadcast-share this with all my peers".
>
> [N.B. I've suggested that these folks should just re-write their applications to
> simply attach to a giant blob of shared memory, and thus avoid all of this. But
> that doesn't fit for various reasons]

I'm not sure it'd be a good idea to introduce a whole new mode of access
control for this when it's something which can be addressed with more
conventional mechanisms. Maybe it's a bit more upfront work, but one-off
security / naming mechanisms feel like they'd have a reasonable chance of
causing long-term headaches.

Thanks.

-- 
tejun