Re: DRM cgroups integration (Was: Re: [PATCH v4 0/8] cgroup private data and DRM/i915 integration)

Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> · Thu, 05 Apr 2018 17:15:13 +0300



+ Some more Cc's based on IRC discussion

Quoting Joonas Lahtinen (2018-04-05 16:46:51)
> + Dave for commenting from DRM subsystem perspective. I strongly believe
> there would be benefit from agreeing on some foundation of DRM subsystem
> level program GPU niceness [-20,19] and memory limit [0,N] pages.
> 
> Quoting Matt Roper (2018-03-30 03:43:13)
> > On Mon, Mar 26, 2018 at 10:30:23AM +0300, Joonas Lahtinen wrote:
> > > Quoting Matt Roper (2018-03-23 17:46:16)
> > > > On Fri, Mar 23, 2018 at 02:15:38PM +0200, Joonas Lahtinen wrote:
> > > > > Quoting Matt Roper (2018-03-17 02:08:57)
> > > > > > This is the fourth iteration of the work previously posted here:
> > > > > >   (v1) https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
> > > > > >   (v2) https://www.mail-archive.com/dri-devel@xxxxxxxxxxxxxxxxxxxxx/msg208170.html
> > > > > >   (v3) https://lists.freedesktop.org/archives/intel-gfx/2018-March/157928.html
> > > > > > 
> > > > > > The high level goal of this work is to allow non-cgroup-controller parts
> > > > > > of the kernel (e.g., device drivers) to register their own private
> > > > > > policy data for specific cgroups.  That mechanism is then made use of in
> > > > > > the i915 graphics driver to allow GPU priority to be assigned according
> > > > > > to the cgroup membership of the owning process.  Please see the v1 cover
> > > > > > letter linked above for a more in-depth explanation and justification.
> > > > > 
> > > > > Hi Matt,
> > > > > 
> > > > > For cross-subsystem changes such as this, it makes sense to Cc all
> > > > > relevant maintainers, especially if there have been previous comments to
> > > > > earlier revisions.
> > > > > 
> > > > > Please, do include and keep a reference to the userspace portion of the
> > > > > changes when you suggest new uAPI to be added. At least I have some trouble
> > > > > trying to track down the relevant interface consumer here.
> > > > > 
> > > > > I'm unsure how much sense it makes to commence with detailed i915 review
> > > > > if we will be blocked by lack of userspace after that? I'm assuming
> > > > > you've read through [1] already.
> > > > 
> > > > Hi Joonas,
> > > > 
> > > > I've sent the userspace code out a few times, but it looks like I forgot
> > > > to include a copy with v4/v4.5.  Here's the version I provided with v3:
> > > >   https://lists.freedesktop.org/archives/intel-gfx/2018-March/157935.html
> > > 
> > > Thanks. Keeping that in the relevant commit message of the patch that
> > > introduces the new uAPI will make it harder to forget and easier for
> > > git blame, too.
> > > 
> > > > 
> > > > Usually we don't consider things like i-g-t to be sufficient userspace
> > > > consumers because we need a real-world consumer rather than a "toy"
> > > > userspace.  However in this case, the i-g-t tool, although very simple,
> > > > is really the only userspace consumer I expect there to ever be.
> > > > Ultimately the true consumers of this cgroups work are bash scripts, sysv
> > > > init scripts, systemd recipes, etc.  that just need a very simple tool
> > > > to assign the specific values that make sense on a given system.
> > > > There's no expectation that graphics clients or display servers would
> > > > ever need to make use of these interfaces.
> > > 
> > > I was under the impression that a bit more generic GPU cgroups support
> > > was receiving a lot of support in the early discussion? A dedicated
> > > intel_cgroup sounds underwhelming when compared to the idea of "gpu_nice",
> > > for user adoption :)
> > 
> > I'm open to moving the cgroup_priv registration/lookup to the DRM core
> > if other drivers are interested in using this mechanism and if we can
> > come to an agreement on a standard priority offset range to support, how
> > display boost should work for all drivers, etc.  There might be some
> > challenges mapping a DRM-defined priority range down to a different
> > range that makes sense for individual driver schedulers, especially
> > since some drivers already expose a different priority scheme to
> > userspace via other interfaces like i915 does with GEM context priority.
> > 
> > So far I haven't really heard any interest outside the Intel camp, but
> > hopefully other driver teams can speak up if they're for/against this.
> > I don't want to try to artificially standardize this if other drivers
> > want to go a different direction with priority/scheduling that's too
> > different from the current Intel-driven design.
> 
> I don't think there are that many directions to go about GPU context
> priority, considering we have the EGL_IMG_context_priority extension, so
> it'll only be about granularity of the scale.
> 
> I would suggest going with a nice-like scale for easy user adoption,
> then just applying that as the N most significant bits.
> 
> The contexts could then of course further adjust their priority from what
> is set by the "gpu_nice" application with the remaining bits.
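
To illustrate the "N most significant bits" idea, here is a minimal sketch.
Everything in it is an assumption made for illustration: the 16-bit
priority width, the macro names and gpu_effective_prio() are hypothetical
and do not come from any posted patch or existing DRM interface. It only
shows that a nice value in [-20,19] needs 6 bits, and that a context can
use the remaining low bits without ever crossing into another nice level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: [ 6 bits from gpu_nice | 10 bits per-context ] */
#define GPU_NICE_MIN   (-20)
#define GPU_NICE_MAX   19
#define GPU_NICE_BITS  6                    /* 40 levels fit in 6 bits */
#define GPU_PRIO_BITS  16                   /* assumed total width */
#define GPU_CTX_BITS   (GPU_PRIO_BITS - GPU_NICE_BITS)
#define GPU_CTX_MASK   ((1u << GPU_CTX_BITS) - 1u)

/*
 * Combine a nice value (lower nice means more important) with a
 * per-context adjustment. A larger return value means higher priority.
 */
static inline uint32_t gpu_effective_prio(int nice, uint32_t ctx_adjust)
{
	assert(nice >= GPU_NICE_MIN && nice <= GPU_NICE_MAX);

	/* Invert so that nice -20 maps to the highest level (39). */
	uint32_t level = (uint32_t)(GPU_NICE_MAX - nice);

	/* Nice level in the top bits, context adjustment in the low bits. */
	return (level << GPU_CTX_BITS) | (ctx_adjust & GPU_CTX_MASK);
}
```

With this layout, a context at nice 0 that maximizes its own adjustment
still ranks below any context at nice -1, which is the property the
scheme above relies on.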
> 
> I feel strongly that this should be a DRM level "gpu_nice". And the
> binding to cgroups should come through DRM core. If it doesn't, limiting
> the amount of memory used becomes awkward as the allocation is
> centralized to DRM core.
> 
> > > Also, I might not be up-to-date about all things cgroups, but the way
> > > intel_cgroup works feels a bit forced. We create a userspace context just
> > > to communicate with the driver and the IOCTL will still have global
> > > effects. I can't help but think that i915 reading from the cgroups subsystem
> > > for the current process would feel more intuitive to me.
> > 
> > I think you're referring to the earlier discussion about exposing
> > priority directly via the cgroups filesystem?  That would certainly be
> > simpler from a userspace perspective, but it's not the direction that
> > the cgroups maintainer wants to see things go.  Adding files directly to
> > the cgroups filesystem is supposed to be something that's reserved for
> > official cgroups controllers.  The GPU priority concept we're trying to
> > add here doesn't align with the requirements for creating a controller,
> > so the preferred approach is to create a custom interface (syscall or
> > ioctl) that simply takes a cgroup as a parameter.  There's precedent
> > with similar interfaces in areas like BPF (where the bpf() system call
> > can accept a cgroup as a parameter and then perform its own private
> > policy changes as it sees fit).
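
For readers unfamiliar with the BPF precedent mentioned above, the sketch
below shows roughly what "a syscall that takes a cgroup as a parameter"
looks like from userspace. The wrapper name cgroup_attach_prog() is
hypothetical; BPF_PROG_ATTACH and the target_fd, attach_bpf_fd and
attach_type fields come from <linux/bpf.h>. It assumes a Linux system with
the kernel uapi headers installed, and it is not the interface proposed in
this series.

```c
#include <assert.h>
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Attach an already-loaded BPF program to a cgroup. The cgroup is passed
 * as an open file descriptor on its cgroup v2 directory, not as a path.
 * Returns 0 on success, or -1 with errno set on failure.
 */
static int cgroup_attach_prog(int cgroup_fd, int prog_fd,
			      enum bpf_attach_type type)
{
	union bpf_attr attr;

	/* Unused attribute fields must be zero. */
	memset(&attr, 0, sizeof(attr));
	attr.target_fd = (__u32)cgroup_fd;   /* the cgroup being configured */
	attr.attach_bpf_fd = (__u32)prog_fd; /* the policy to apply to it */
	attr.attach_type = type;

	return (int)syscall(__NR_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr));
}
```

A caller would typically open the cgroup directory with O_RDONLY and pass
the resulting descriptor as cgroup_fd; the cgroup filesystem itself gains
no new files.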
> > 
> > Using a true cgroups controller and exposing settings via the filesystem
> > is likely still the way we'll want to go for some other types of
> > cgroups-based policy in the future (e.g., limiting GPU memory usage); it
> > just isn't the appropriate direction for priority.
> 
> Might be just me, but it feels a bit crazy to be setting GPU memory usage
> through another interface and then doing i915 specific IOCTLs to control
> the priority of that same cgroup.
> 
> I don't feel comfortable adding custom cgroups-dependent IOCTLs to i915
> where cgroups is only working as the variable carrier in the background. We
> should really just be consuming a variable from cgroups and it should be
> set outside of the i915 IOCTL interface.
> 
> I'm still seeing that we should have a DRM cgroups controller and a DRM
> subsystem wide application to control the priority and memory usage
> to be fed to the drivers.
> 
> If we end up just supporting i915 apps, we could as well use LD_PRELOAD
> wrapper and alter the context priority at creation time for exactly the
> same effect and no extra interfaces to maintain.
> 
> > > Does the implementation mimic some existing cgroups tool or de-facto way
> > > of doing things in cgroups world?
> > 
> > The ioctl approach I took is similar to the syscall approach that the BPF
> > guys use to attach BPF programs to a cgroup.  I'm not very familiar with
> > BPF or how it gets used from userspace, so I'm not sure whether the
> > interface is intended for one specific tool (like ours is), or whether
> > there's more variety for userspace consumers.
> 
> Is the proposal to set the memory usage from a similar interface, or is
> that still not implemented?
> 
> I'm seeing a very close relation between time-slicing GPU time and
> allowed GPU buffer allocations, so having two completely different
> interfaces just feels like a very hackish way of implementing this.
> 
> Regards, Joonas
> 
> > 
> > 
> > Matt
> > 
> > > 
> > > Regards, Joonas
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe cgroups" in
> > > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > -- 
> > Matt Roper
> > Graphics Software Engineer
> > IoTG Platform Enabling & Development
> > Intel Corporation
> > (916) 356-2795