On 23/08/18 12:43, luca abeni wrote:
> On Thu, 23 Aug 2018 12:23:50 +0200
> Juri Lelli <juri.lelli@xxxxxxxxx> wrote:

[...]

> > But then what is a sane inheritance mechanism?
>
> In my understanding (please correct me if I am wrong), this is an
> orthogonal issue: if I understand well, the issue preventing non-root
> usage of SCHED_DEADLINE is that a task inheriting a dl entity is not
> throttled (when the current runtime arrives to 0, the deadline is
> postponed, but the task stays schedulable). So, I think that removing
> this behaviour should allow to use SCHED_DEADLINE without starving
> other tasks...

Right, potential starvation would be gone...

> Then, there is the issue about the deadline and runtime to inherit. And
> I agree that this is important (and the solution is not easy), but you
> have this issue even if you use the current "dl_boosted" behaviour...
> No?

... but, while it's true that current inheritance has problems with or
without the boosting behaviour, I wonder if the problem might be more
painful for a normal user that is still under runtime enforcement.
Disabling enforcement seems to hide a bit the fact that we need proper
inheritance. :-(

> > Walk the chain and find
> > the next potential deadline to inherit for the current boosted (still
> > runtime enforced) task before throttling it? Not sure it's going to be
> > any easier than the proxy solution. :-/
>
> Right; this is not easy... But I think it is not related with the issue
> we are discussing (if I understand this email thread well). Yes, it has
> to be fixed, but it does not prevent non-root usage. Or am I missing
> something?

I guess it depends on how bad we think the problem I described above is.

> > > 2) Implement some mechanism to limit the amount of dl bandwidth a
> > >    user can allocate to itself (I think the cgroup-based approach we
> > >    discussed some time ago should be OK... If I remember well, you
> > >    even had a patch implementing it :)
> >
> > I think most (all?) distributions today run with CONFIG_RT_GROUP_SCHED
> > disabled, we should also think about a solution that doesn't rely on
> > that interface.
>
> I guess CONFIG_RT_GROUP_SCHED is often disabled because it ends up
> changing the "traditional" SCHED_{RR/FIFO} behaviour. So, maybe the
> solution is to have a different dl_{runtime,period} interface in
> control groups (enabled by CONFIG_DL_GROUP_SCHED :).
> CONFIG_DL_GROUP_SCHED does not change the scheduling behaviour, but
> only the admission test... So, enabling it could be safer than enabling
> CONFIG_RT_GROUP_SCHED.

Not sure if adding yet another config switch is acceptable. FWIW, I'd
prefer not to add it, also looking back at all the problems
RT_GROUP_SCHED seems to pose/have posed, but if it turns out to be the
only option...

> > Maybe the existing system wide sched_rt_{period,runtime}_us are
> > enough?
>
> I do not know... A cgroup-based interface looks more powerful (and not
> so difficult to implement... :), and would allow the sysadmin to decide
> which users can use SCHED_DEADLINE, how much SCHED_DEADLINE bandwidth
> can each user/group use, etc...

How about extending PAM limits instead? It looks like it's what (e.g.)
audio users rely on already [1]. Would it maybe be possible to add
dlruntime, dlperiod and dldeadline parameters in there?

1 - https://fedoraproject.org/wiki/JACK_Audio_Connection_Kit#Running_Jack_in_Realtime_mode
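
To make the "bandwidth cap" part of the discussion concrete, here is a
rough userspace sketch of the kind of check a per-user limit would need.
It is not kernel code and not a proposed patch: struct dl_params and
dl_admit() are hypothetical names, the single-CPU model is a
simplification, and the fixed-point ratio is only loosely modelled on
how the scheduler represents bandwidth. The cap could come from the
system wide sched_rt_{period,runtime}_us values, from a cgroup, or from
PAM-style dlruntime/dlperiod limits as suggested above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-point scale for bandwidth ratios: 1.0 == 1 << BW_SHIFT. */
#define BW_SHIFT 20

/* Hypothetical per-task reservation, in microseconds. */
struct dl_params {
	uint64_t runtime_us;
	uint64_t period_us;
};

/* Bandwidth as runtime/period in fixed point; 0 for an invalid period. */
static uint64_t to_ratio(uint64_t period_us, uint64_t runtime_us)
{
	if (period_us == 0)
		return 0;
	return (runtime_us << BW_SHIFT) / period_us;
}

/*
 * Hypothetical admission check for one user on one CPU: accept the
 * candidate only if the bandwidth already admitted plus the candidate's
 * bandwidth stays within cap_runtime_us / cap_period_us.
 */
static bool dl_admit(const struct dl_params *admitted, size_t n,
		     const struct dl_params *cand,
		     uint64_t cap_runtime_us, uint64_t cap_period_us)
{
	uint64_t cap = to_ratio(cap_period_us, cap_runtime_us);
	uint64_t used = 0;

	for (size_t i = 0; i < n; i++)
		used += to_ratio(admitted[i].period_us, admitted[i].runtime_us);

	return used + to_ratio(cand->period_us, cand->runtime_us) <= cap;
}
```

With a cap of 950000/1000000 (95%), a user already holding a 10ms/100ms
reservation could add a 50ms/100ms one, but not a 90ms/100ms one. Note
that this only covers the admission side; it says nothing about which
deadline and runtime a boosted task should inherit.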