In containerized or virtualized environments, there is desire to have controls in place for resources that can be consumed by users of a GPU device. This RFC patch series proposes a framework for integrating use of existing cgroup controllers into device drivers. The i915 driver is updated in this series as our primary use case to leverage this framework and to serve as an example for discussion. The patch series enables device drivers to use cgroups to control the following resources within a GPU (or other accelerator device): * control allocation of device memory (reuse of memcg) and with future work, we could extend to: * track and control share of GPU time (reuse of cpu/cpuacct) * apply mask of allowed execution engines (reuse of cpusets) Instead of introducing a new cgroup subsystem for GPU devices, a new framework is proposed to allow devices to register with existing cgroup controllers, which creates per-device cgroup_subsys_state within the cgroup. This gives device drivers their own private cgroup controls (such as memory limits or other parameters) to be applied to device resources instead of host system resources. Device drivers (GPU or other) are then able to reuse the existing cgroup controls, instead of inventing similar ones. Per-device controls would be exposed in cgroup filesystem as: mount/<cgroup_name>/<subsys_name>.devices/<dev_name>/<subsys_files> such as (for example): mount/<cgroup_name>/memory.devices/<dev_name>/memory.max mount/<cgroup_name>/memory.devices/<dev_name>/memory.current mount/<cgroup_name>/cpu.devices/<dev_name>/cpu.stat mount/<cgroup_name>/cpu.devices/<dev_name>/cpu.weight The drm/i915 patch in this series is based on top of other RFC work [1] for i915 device memory support. AMD [2] and Intel [3] have proposed related work in this area within the last few years, listed below as reference. This new RFC reuses existing cgroup controllers and takes a different approach than prior work. Finally, some potential discussion points for this series: * merge proposed <subsys_name>.devices into a single devices directory? * allow devices to have multiple registrations for subsets of resources? * document a 'common charging policy' for device drivers to follow? [1] https://patchwork.freedesktop.org/series/56683/ [2] https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html [3] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html Brian Welty (5): cgroup: Add cgroup_subsys per-device registration framework cgroup: Change kernfs_node for directories to store cgroup_subsys_state memcg: Add per-device support to memory cgroup subsystem drm: Add memory cgroup registration and DRIVER_CGROUPS feature bit drm/i915: Use memory cgroup for enforcing device memory limit drivers/gpu/drm/drm_drv.c | 12 + drivers/gpu/drm/drm_gem.c | 7 + drivers/gpu/drm/i915/i915_drv.c | 2 +- drivers/gpu/drm/i915/intel_memory_region.c | 24 +- include/drm/drm_device.h | 3 + include/drm/drm_drv.h | 8 + include/drm/drm_gem.h | 11 + include/linux/cgroup-defs.h | 28 ++ include/linux/cgroup.h | 3 + include/linux/memcontrol.h | 10 + kernel/cgroup/cgroup-v1.c | 10 +- kernel/cgroup/cgroup.c | 310 ++++++++++++++++++--- mm/memcontrol.c | 183 +++++++++++- 13 files changed, 552 insertions(+), 59 deletions(-) -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel