Hello,

On Mon, Oct 29, 2018 at 07:49:13PM -0400, Kenny Ho wrote:
> Unlike a computer, however, GPUs are not as well "standardized" as a
> desktop or a server. For the gpu/drm subsystem, there are something
> that are common (such as buffer sharing and buffer lifetime
> management), something that are shared by some vendors (software
> scheduler) and something that are very much vendor specific. Due to
> this, a generic cgroup controller for drm may need to be more
> pluggable than other cgroup controller. We took a look at the rdma
> cgroup as part of our research but rdma appears to have resources
> that are more abstracted and standardized.
>
> What do you think? Does drm/gpu warrant its own full-fledged cgroup
> controller?

First of all, the summary is much appreciated.  Here are my two cents.

I think it could help a lot to think about what features users would
eventually want instead of specific hardware details.  The hardware
might not be all that standardized, but what users want in terms of
resource control doesn't vary much - e.g. "this is more important,
that's less, but I don't want to leave the device idle while there's
work to do" or "this guy paid me X while that guy paid Y, let's make
sure each gets what they paid for".

Rather than trying to build the interface up from what each device can
do, building it down from high-level user needs has, I believe, a
better chance of reaching an interface which a wide audience would
find useful and which can stand the test of time.  IOW, make the
interface about user intentions rather than underlying implementation
details.  In the long term, this also helps us (kernel devs) because
implementation details don't get locked into a widely used interface.

Also, I wouldn't recommend using rdma as the benchmark.  While rdma
has abstract standard resources defined and the controller distributes
numeric amounts of them, what those amounts mean to users is poorly
defined, unintuitive and difficult to use.  That was what rdma could
do given the circumstances of the area (it's really difficult to
define a work or cost metric for IO devices), but I think gpus have a
much better chance of reaching something which is a lot more
meaningful and useful.

As the first (rather challenging) step, what's likely to be the most
useful to the widest audience is work-conserving proportional control
- gpu.weight.  The implementation strategy can differ across gpu
vendors but the concept is as universal as it gets - A should be able
to do X times more work than B.

Thanks.

-- 
tejun
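
P.S. To make the intent concrete, here's a rough userspace sketch of
how such a knob could be used.  This is purely hypothetical -
gpu.weight doesn't exist, and I'm only assuming it would behave like
cpu.weight does in cgroup v2:

	/*
	 * Hypothetical example: give cgroup A twice the GPU share of
	 * cgroup B by writing to a (not yet existing) gpu.weight file,
	 * the same way cpu.weight is configured in cgroup v2.  The
	 * cgroup paths and the knob name are assumptions for
	 * illustration only.
	 */
	#include <stdio.h>

	static int set_gpu_weight(const char *cgroup, int weight)
	{
		char path[256];
		FILE *f;

		snprintf(path, sizeof(path),
			 "/sys/fs/cgroup/%s/gpu.weight", cgroup);
		f = fopen(path, "w");
		if (!f)
			return -1;
		fprintf(f, "%d\n", weight);
		return fclose(f);
	}

	int main(void)
	{
		/*
		 * When both A and B have work queued, A should get
		 * roughly twice the device time of B.  When only one
		 * is active, it can consume the whole device - that's
		 * the work-conserving part.
		 */
		if (set_gpu_weight("A", 200) || set_gpu_weight("B", 100))
			perror("gpu.weight");
		return 0;
	}

The point is that the knob expresses relative importance, not
device-specific absolute quantities, so each vendor stays free to
implement the enforcement however its hardware allows.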