On Mon, Apr 13, 2020 at 03:11:36PM -0400, Tejun Heo wrote:
> Hello, Kenny.
>
> On Tue, Mar 24, 2020 at 02:49:27PM -0400, Kenny Ho wrote:
> > Can you elaborate more on what are the missing pieces?
>
> Sorry about the long delay, but I think we've been going in circles for quite
> a while now. Let's try to make it really simple as the first step. How about
> something like the following?
>
> * gpu.weight (should it be gpu.compute.weight? idk) - A single number
>   per-device weight similar to io.weight, which distributes computation
>   resources in work-conserving way.
>
> * gpu.memory.high - A single number per-device on-device memory limit.
>
> The above two, if works well, should already be plenty useful. And my guess is
> that getting the above working well will be plenty challenging already even
> though it's already excluding work-conserving memory distribution. So, let's
> please do that as the first step and see what more would be needed from there.

This agrees with my understanding of the consensus here and what's
reasonably possible across different gpus. And in case this isn't clear:
This is very much me talking with my drm co-maintainer hat on, not with a
gpu vendor hat on (since that's implied somewhere further down the
discussion).

My understanding from talking with a few other folks is that the
cpumask-style CU-weight thing is not something any other gpu can
reasonably support (and we have about 6+ of those in-tree), whereas some
work-conserving computation resource thing should be doable for anyone
with a scheduler. +/- more or less the same issues as io devices: there
might be quite a bit of latency involved in going from one client to the
other, because gpu pipelines are deep and pre-emption for gpus is rather
slow. And ofc not all gpu "requests" use equal amounts of resources
(different engines and stuff just to begin with), same way not all io
requests are made equal.
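
For illustration only (not from this thread, the drm scheduler, or any driver): a minimal Python sketch of what "weight-proportional and work-conserving" could mean for a per-device gpu.weight. The function name `distribute`, the `(weight, demand)` tuples, and the abstract "capacity" unit are all assumptions made up for the example. Idle groups get nothing, and whatever a group does not use is handed to the busy ones in proportion to their weights.

```python
def distribute(capacity, groups):
    """Split `capacity` across cgroups by weight, work-conserving.

    groups: {name: (weight, demand)}  (hypothetical representation)
    Returns {name: allocation}. A group never gets more than it demands;
    unused share is redistributed to groups that still want more.
    """
    alloc = {name: 0.0 for name in groups}
    # Only groups with a positive weight and pending work compete.
    active = {n for n, (w, d) in groups.items() if w > 0 and d > 0}
    remaining = float(capacity)

    while active and remaining > 1e-9:
        total_weight = sum(groups[n][0] for n in active)
        # Groups whose weighted share already covers their demand.
        satisfied = {
            n for n in active
            if remaining * groups[n][0] / total_weight >= groups[n][1]
        }
        if not satisfied:
            # Everyone is still hungry: plain weighted split of what is left.
            for n in active:
                alloc[n] = remaining * groups[n][0] / total_weight
            break
        # Cap satisfied groups at their demand and free up the rest.
        for n in satisfied:
            alloc[n] = float(groups[n][1])
            remaining -= groups[n][1]
        active -= satisfied

    return alloc


if __name__ == "__main__":
    # "c" has weight but no work queued, so "a" and "b" split the device 1:2.
    print(distribute(100, {"a": (100, 100), "b": (200, 100), "c": (100, 0)}))
```

This ignores everything that makes the real problem hard (pre-emption latency, requests of unequal cost, multiple engines); it only pins down the intended steady-state split.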

Plus since we do have a shared scheduler used by at least most drivers,
this shouldn't be too hard to get done somewhat consistently across
drivers.

tl;dr: Acked by me.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch