On 10/10/18 13:56, Henrik Austad wrote:
> On Tue, Oct 09, 2018 at 11:24:26AM +0200, Juri Lelli wrote:
> > Hi all,
>
> Hi, nice series! I have a lot of details to grok, but I like the idea of PE.
>
> > Proxy Execution (also goes under several other names) isn't a new
> > concept, it has been mentioned already in the past to this community
> > (both in email discussions and at conferences [1, 2]), but no actual
> > implementation that applies to a fairly recent kernel exists as of today
> > (that I'm aware of, at least - happy to be proven wrong).
> >
> > Very broadly speaking (more info below), proxy execution enables a task
> > to run using the context of some other task that is "willing" to
> > participate in the mechanism, as this helps both tasks to improve
> > performance (w.r.t. the latter task not participating in proxy
> > execution).
>
> From what I remember, PEP was originally proposed for global EDF, and as
> far as my head has been able to read this series, this implementation is
> planned not only for deadline, but eventually also for sched_(rr|fifo|other)
> - is that correct?

Correct, this is cross class.

> I have a bit of concern when it comes to affinities and where the
> lock owner will actually execute while in the context of the proxy,
> especially when you run into the situation where you have disjoint CPU
> affinities for _rr tasks to ensure the deadlines.

Well, it's the (scheduling) context of the proxy that is potentially moved
around. The lock owner stays inside its affinity.

> I believe there were some papers circulated last year that looked at
> something similar to this when you had overlapping or completely disjoint
> CPUsets, which I think would be nice to drag into the discussion. Has this
> been considered? (if so, sorry for adding line-noise!)

I think you refer to the BBB work. Not sure if it applies here, though
(considering the above).

> Let me know if my attempt at translating brainlanguage into semi-coherent
> English failed and I'll do another attempt.

You succeeded! (that's assuming I got your questions right, of course :)

> > This RFD/proof of concept aims at starting a discussion about how we can
> > get proxy execution in mainline. But, first things first, why do we even
> > care about it?
> >
> > I'm pretty confident in saying that the line of development that is
> > mainly interested in this at the moment is the one that might benefit
> > from allowing non-privileged processes to use deadline scheduling [3].
> > The main missing bit before we can safely relax the root privileges
> > constraint is a proper priority inheritance mechanism, which translates
> > to bandwidth inheritance [4, 5] for deadline scheduling, or to some sort
> > of interpretation of the concept of running a task holding a (rt_)mutex
> > within the bandwidth allotment of some other task that is blocked on the
> > same (rt_)mutex.
> >
> > The concept itself is pretty general however, and it is not hard to
> > foresee possible applications in other scenarios (say for example nice
> > values/shares across co-operating CFS tasks or clamping values [6]).
> > But I'm already digressing, so let's get back to the code that comes
> > with this cover letter.
> >
> > One can define the scheduling context of a task as all the information
> > in task_struct that the scheduler needs to implement a policy, and the
> > execution context as all the state required to actually "run" the task.
> > An example of scheduling context might be the information contained in
> > task_struct se, rt and dl fields; affinity pertains instead to execution
> > context (and I guess deciding what pertains to what is actually up for
> > discussion as well ;-). Patch 04/08 implements this distinction.
>
> I really like the idea of splitting scheduling ctx and execution context!
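
Good! To give a very rough idea of what the split means (an illustrative
sketch only, not what patch 04/08 literally does; the sched_ctx/exec_ctx
names below are made up for this example), one can think of task_struct
state as being carved up roughly along these lines:

/*
 * Illustrative sketch, hypothetical grouping: the state the scheduler
 * picks on vs. the state the CPU needs to actually run the task.
 */
struct sched_ctx {                      /* scheduling context */
        int                     prio;
        struct sched_entity     se;     /* CFS */
        struct sched_rt_entity  rt;     /* RT */
        struct sched_dl_entity  dl;     /* DEADLINE */
};

struct exec_ctx {                       /* execution context */
        void                    *stack;
        struct mm_struct        *mm;
        cpumask_t               cpus_allowed;   /* affinity follows execution */
};

The scheduler keeps making decisions based on the scheduling context, while
what ends up on the CPU is some task's execution context; with proxy
execution the two may belong to different tasks.
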
> > As implemented in this set, a link between scheduling contexts of
> > different tasks might be established when a task blocks on a mutex held
> > by some other task (blocked_on relation). In this case the former task
> > starts to be considered a potential proxy for the latter (mutex owner).
> > One key change made here in how mutexes work is that waiters don't
> > really sleep: they are not dequeued, so they can be picked up by the
> > scheduler when it runs. If a waiter (potential proxy) task is selected
> > by the scheduler, the blocked_on relation is used to find the mutex
> > owner and put that to run on the CPU, using the proxy task's scheduling
> > context.
> >
> > Follow the blocked-on relation:
> >
> >              ,-> task          <- proxy, picked by scheduler
> >              |     | blocked-on
> >              |     v
> > blocked-task |   mutex
> >              |     | owner
> >              |     v
> >              `-- task          <- gets to run using proxy info
> >
> > Now, the situation is (of course) more tricky than depicted so far
> > because we have to deal with all sorts of possible states the mutex
> > owner might be in while a potential proxy is selected by the scheduler,
> > e.g. the owner might be sleeping, running on a different CPU, blocked on
> > another mutex itself... so, I'd kindly refer people to have a look at
> > the 05/08 proxy() implementation and comments.
>
> My head hurts already.. :)

Eh. I was wondering about putting even more details in the cover letter,
but then I thought it might be enough info already for this first spin.
Guess we'll have to create proper docs (after how to properly implement
this has been agreed upon?).

> > Peter kindly shared his WIP patches with us (me, Luca, Tommaso, Claudio,
> > Daniel, the Pisa gang) a while ago, but I could seriously have a decent
> > look at them only recently (thanks a lot to the other guys for giving a
> > first look at this way before me!). This set is thus composed of Peter's
> > original patches (which I rebased on tip/sched/core as of today,
> > commented, and hopefully duly reported in the changelogs what I may have
> > broken) plus a bunch of additional changes that seemed required to make
> > all this boot "successfully" on a virtual machine. So be advised! This
> > is good only for fun ATM (I actually really hope this is good enough for
> > discussion), pretty far from production I'm afraid. Share early, share
> > often, right? :-)
>
> I'll give it a spin and see if it boots, then I probably have a ton of
> extra questions :)

Thanks! (I honestly expect sparks.. but it'll give us clues about what
needs fixing)

Thanks a lot for looking at this.

Best,

- Juri
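
P.S.: in case pseudo-code helps more than the diagram above, the gist of
what needs to happen when the scheduler picks a blocked waiter is roughly
the following. This is a hand-wavy sketch only, with made-up helper names
(blocked_on_mutex(), mutex_owner_of()); the real 05/08 proxy() has to deal
with sleeping owners, owners running on other CPUs, migrations, chains of
mutexes, and so on.

/* Hand-wavy sketch, not the actual proxy() implementation. */
static struct task_struct *proxy_find_exec_ctx(struct task_struct *picked)
{
        struct task_struct *owner = picked;
        struct mutex *m;

        /* Waiters are not dequeued anymore, so "picked" may be blocked. */
        while ((m = blocked_on_mutex(owner))) {
                owner = mutex_owner_of(m);      /* hop along the chain */
                if (!owner)
                        break;                  /* lock released meanwhile */
        }

        /*
         * "picked" donates its scheduling context (se/rt/dl); the task
         * returned here provides the execution context that actually
         * gets to run on the CPU.
         */
        return owner ? owner : picked;
}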