Hi Vincent and Kevin,

thanks a lot for your report. I was actually experiencing something
that I think is related to your issue, but then I didn't find any time
to send out a proper patch :/. Could you please test what I've attached
and see if it fixes your problem?

Thanks a lot,

- Juri

On 28 August 2014 22:07, Vincent Legout <vincent@xxxxxxxxxxx> wrote:
> Hello,
>
> Juri Lelli <juri.lelli@xxxxxxxxx> writes:
>
>> On Wed, 2 Jul 2014 17:08:47 -0400
>> Kevin Burns <kevinpb@xxxxxx> wrote:
>>
>>> Here's the issue:
>>>
>>> I am able to allocate a bandwidth with a ratio of .1 to two
>>> processes using the sched_setattr() system call.
>>>
>>> I am then able to add said tasks to a cpuset (with one physical
>>> processor) using cset.
>>>
>>> However, when I then try to update the runtime or period of either
>>> task, sched_setattr() returns an -EBUSY error.
>>>
>>> Now, if I repeat the above experiment with just one task, I am able
>>> to update the runtime or period without issue. I ran trace-cmd and
>>> kernelshark to verify that the bandwidths were indeed being updated
>>> correctly. That, and htop was reporting a higher percentage of CPU
>>> usage, which correlated to the ratios of my tasks' bandwidth.
>>>
>>> Any ideas as to why cpuset would cause this behaviour?
>>>
>>
>> Could you create a script that I can use to run your setup and
>> reproduce the problem?
>
> Sorry for the delayed answer. I'm working with Kevin and the problem
> can be reproduced using the attached files, also available here:
>
> http://legout.info/~vincent/sd/
>
> On an Ubuntu 14.04 system running Linux 3.16, when running run.sh for
> the 2nd time, the 2nd call to sched_setattr() returns EBUSY.
> Uncommenting line 41 of run.sh fixes this by returning to SCHED_OTHER
> before moving the task to the cpuset.
>
> The problem arises when using both cpusets and SCHED_DEADLINE.
> The process described in section 5.1 of the SCHED_DEADLINE
> documentation works fine if the process stays in the same cpuset, but
> I think there are some issues when moving a process already in the
> SCHED_DEADLINE policy from one cpuset to another.
>
> According to our experiments, it seems that some fields are not
> updated during this process, and it thus fails. When a task moves
> from one cpuset to another, the total_bw fields of both cpusets don't
> seem to be updated. Thus, in the next sched_setattr() call,
> __dl_overflow() returns 1 because it thinks total_bw is 0 in the new
> cpuset. Then, dl_overflow() returns -1 and we get an EBUSY error.
>
> The total_bw field may also overflow because __dl_clear() and
> __dl_add() are called while the task whose bandwidth is tsk_bw is not
> in the cpu represented by dl_b.
>
> We can get around this by moving the process back to another
> scheduling policy before moving it to another cpuset. But we also had
> to apply the following patch in order to be sure that the bandwidth
> is always updated (on top of v3.16). I'd think this condition was
> added to skip all the tests if the bandwidth doesn't change. But
> there is an issue because then, the total_bw field is not going to be
> updated for the new cpu. I'd think the problem comes from the fact
> that p->dl.dl_bw is not updated when a task leaves or rejoins the
> SCHED_DEADLINE policy.
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index bc1638b..0df3008 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2031,9 +2031,6 @@ static int dl_overflow(struct task_struct *p, int policy,
>  	u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
>  	int cpus, err = -1;
>
> -	if (new_bw == p->dl.dl_bw)
> -		return 0;
> -
>  	/*
>  	 * Either if a task, enters, leave, or stays -deadline but changes
>  	 * its parameters, we may need to update accordingly the total
>
> I hope the above explanations make sense and I didn't miss anything
> trivial.
> I'd be happy to provide more information or test anything if needed.
>
> Thanks,
> Vincent
>
From 2243d25fbf0df112a83c12ca8a05e4384ea17058 Mon Sep 17 00:00:00 2001
From: Juri Lelli <juri.lelli@xxxxxxx>
Date: Wed, 11 Jun 2014 16:37:31 +0200
Subject: [PATCH] sched/deadline: clear dl_entity params when setscheduling to different class

When a task is using SCHED_DEADLINE and the user setschedules it to a
different class its sched_dl_entity static parameters are not cleaned
up. This causes a bug if the user sets it back to SCHED_DEADLINE with
the same parameters again. The problem resides in the check we perform
at the very beginning of dl_overflow():

	if (new_bw == p->dl.dl_bw)
		return 0;

This condition is met in the case depicted above, so the function
returns and dl_b->total_bw is not updated (the p->dl.dl_bw is not added
to it). After this, admission control is broken.

This patch fixes the thing, properly clearing static parameters for a
task that ceases to use SCHED_DEADLINE.

Reported-by: Daniele Alessandrelli <daniele.alessandrelli@xxxxxxxxx>
Signed-off-by: Juri Lelli <juri.lelli@xxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
 kernel/sched/core.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec1a286..01738b2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1776,6 +1776,21 @@ int wake_up_state(struct task_struct *p, unsigned int state)
 }
 
 /*
+ * This function clears the sched_dl_entity static params.
+ */
+static void
+__dl_clear_params(struct task_struct *p)
+{
+	struct sched_dl_entity *dl_se = &p->dl;
+
+	dl_se->dl_runtime = 0;
+	dl_se->dl_deadline = 0;
+	dl_se->dl_period = 0;
+	dl_se->flags = 0;
+	dl_se->dl_bw = 0;
+}
+
+/*
  * Perform scheduler related setup for a newly forked process p.
  * p is forked by current.
  *
@@ -1799,10 +1814,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 
 	RB_CLEAR_NODE(&p->dl.rb_node);
 	hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
-	p->dl.dl_runtime = p->dl.runtime = 0;
-	p->dl.dl_deadline = p->dl.deadline = 0;
-	p->dl.dl_period = 0;
-	p->dl.flags = 0;
+	__dl_clear_params(p);
 
 	INIT_LIST_HEAD(&p->rt.run_list);
 
@@ -2060,6 +2072,7 @@ static int dl_overflow(struct task_struct *p, int policy,
 		err = 0;
 	} else if (!dl_policy(policy) && task_has_dl_policy(p)) {
 		__dl_clear(dl_b, p->dl.dl_bw);
+		__dl_clear_params(p);
 		err = 0;
 	}
 	raw_spin_unlock(&dl_b->lock);
-- 
2.0.4