Hello,

On Mon, Apr 13, 2020 at 04:17:14PM -0400, Kenny Ho wrote:
> Perhaps we can even narrow things down to just
> gpu.weight/gpu.compute.weight as a start? In this aspect, is the key

That sounds great to me.

> objection to the current implementation of gpu.compute.weight the
> work-conserving bit? This work-conserving requirement is probably
> what I have missed for the last two years (and hence going in circle.)
>
> If this is the case, can you clarify/confirm the followings?
>
> 1) Is resource scheduling goal of cgroup purely for the purpose of
> throughput? (at the expense of other scheduling goals such as
> latency.)

It's not; however, work-conserving mechanisms are the easiest to use (cuz
you don't lose anything) while usually challenging to implement. Working on
them tends to clarify how control mechanisms should be structured - even
what the resources are.

> 2) If 1) is true, under what circumstances will the "Allocations"
> resource distribution model (as defined in the cgroup-v2) be
> acceptable?

Allocations definitely are acceptable and it's not a pre-requisite to have
work-conserving control first either. Here, given the lack of consensus in
terms of what even constitutes resource units, I don't think it'd be a good
idea to commit to the proposed interface and believe it'd be beneficial to
work on interface-wise simpler work-conserving controls.

> 3) If 1) is true, are things like cpuset from cgroup v1 no longer
> acceptable going forward?

Again, they're acceptable.

> To be clear, while some have framed this (time sharing vs spatial
> sharing) as a partisan issue, it is in fact a technical one. I have
> implemented the gpu cgroup support this way because we have a class of
> users that value low latency/low jitter/predictability/synchronicity.
> For example, they would like 4 tasks to share a GPU and they would
> like the tasks to start and finish at the same time.
>
> What is the rationale behind picking the Weight model over Allocations
> as the first acceptable implementation? Can't we have both
> work-conserving and non-work-conserving ways of distributing GPU
> resources? If we can, why not allow non-work-conserving
> implementation first, especially when we have users asking for such
> functionality?

I hope the rationales are clear now. What I'm objecting to is the inclusion
of a premature interface, which is a lot easier and more tempting to do for
hardware-specific limits, and the proposals up until now have been showing
ample signs of that. I don't think my position has changed much since the
beginning - do the difficult-to-implement but easy-to-use weights first and
then you and everyone would have a better idea of what hard-limit or
allocation interfaces and mechanisms should look like, or even whether
they're needed.

Thanks.

--
tejun
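
For readers following the thread, the difference between the two models discussed above can be illustrated with a small sketch. This is not code from the thread, the kernel, or any GPU driver; the function names, the group names, and the notion of a single scalar "capacity" are all assumptions made purely for illustration. It only shows why a weight-based, work-conserving split "doesn't lose anything" while a fixed allocation can leave capacity idle.

```python
def share_by_weight(weights, demand, capacity=100.0):
    """Work-conserving (hypothetical model): split capacity by weight among
    groups that still want more; whatever a group cannot use is re-offered
    to the remaining groups in proportion to their weights."""
    grant = {g: 0.0 for g in weights}
    active = {g for g in weights if demand[g] > 0}
    remaining = capacity
    while active and remaining > 1e-9:
        total_w = sum(weights[g] for g in active)
        satisfied = set()
        used = 0.0
        for g in active:
            offer = remaining * weights[g] / total_w
            take = min(offer, demand[g] - grant[g])
            grant[g] += take
            used += take
            if demand[g] - grant[g] <= 1e-9:
                satisfied.add(g)
        remaining -= used
        if not satisfied:
            break  # every group took its full offer, so capacity is used up
        active -= satisfied
    return grant


def share_by_allocation(limits, demand):
    """Non-work-conserving (hypothetical model): each group is capped at its
    own fixed allocation; what it leaves idle is not given to anyone else."""
    return {g: min(limits[g], demand[g]) for g in limits}


# Illustrative numbers only: group "a" is mostly idle, "b" and "c" are busy.
demand = {"a": 10.0, "b": 100.0, "c": 100.0}
by_weight = share_by_weight({"a": 1, "b": 1, "c": 2}, demand)
by_alloc = share_by_allocation({"a": 25.0, "b": 25.0, "c": 50.0}, demand)
print(by_weight, sum(by_weight.values()))
print(by_alloc, sum(by_alloc.values()))
```

With these made-up numbers, the weight-based split hands the capacity that "a" does not use to "b" and "c" in a 1:2 ratio, so the whole capacity stays in use. The fixed allocation leaves that slack unused, but gives each group a predictable ceiling, which is the property the low-jitter users in the quoted text are asking for.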