On Thu, May 06, 2021 at 10:06:32PM -0400, Kenny Ho wrote:
> Sorry for the late reply (I have been working on other stuff.)
>
> On Fri, Feb 5, 2021 at 8:49 AM Daniel Vetter <daniel@xxxxxxxx> wrote:
> >
> > So I agree that on one side CU mask can be used for low-level quality
> > of service guarantees (like the CLOS cache stuff on intel cpus as an
> > example), and that's going to be rather hw specific no matter what.
> >
> > But my understanding of AMD's plans here is that CU mask is the only
> > thing you'll have to partition gpu usage in a multi-tenant environment
> > - whether that's cloud or also whether that's containing apps to make
> > sure the compositor can still draw the desktop (except for fullscreen
> > ofc) doesn't really matter I think.
>
> This is not correct. Even in the original cgroup proposal, it
> supports both mask and count as a way to define unit(s) of sub-device.
> For AMD, we already have SRIOV that supports GPU partitioning in a
> time-sliced-of-a-whole-GPU fashion.

Hm, I missed that. I feel like a time-sliced-of-a-whole gpu is the
easier gpu cgroups controller to get started with, since it's much
closer to other cgroups that control bandwidth of some kind. Whether
it's i/o bandwidth or compute bandwidth is kinda a wash.

CU mask feels a lot more like an isolation/guaranteed forward progress
kind of thing, and I suspect that's always going to be a lot more gpu
hw specific than anything we can reasonably put into a general cgroups
controller.

Also, for the time slice cgroups thing, can you please give me pointers
to these old patches that had it, and how it's done? I very obviously
missed that part.

Thanks, Daniel

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch