Re: [PATCH 1/3] sched: fix cgroup movement of newly created process

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/12/2011 10:57 PM, Daisuke Nishimura wrote:

> There is a small race between do_fork() and sched_move_task(), which is trying
> to move the child.
>
>             do_fork()                 sched_move_task()
> --------------------------------+---------------------------------
>   copy_process()
>     sched_fork()
>       task_fork_fair()
>         -> vruntime of the child is initialized
>            based on that of the parent.


Hmm, so here vruntime of child is actually initialized to vruntime - min_V

>   -> we can see the child in "tasks" file now.
>                                     task_rq_lock()
>                                     task_move_group_fair()


So since on a regular fork we just copy and don't actually go through
the attach muck I'm assuming this is an external actor who's seen the
child in the tasks file and is moving it?

>                                       ->child.se.vruntime -= (old)cfs_rq->min_vruntime
>                                       ->child.se.vruntime += (new)cfs_rq->min_vruntime


This would then add delta min_V between the two cfs_rqs

>                                     task_rq_unlock()
>   wake_up_new_task()
>     ...
>     enqueue_entity()
>       child.se->vruntime += cfs_rq->min_vruntime

>
> As a result, vruntime of the child becomes far bigger than min_vruntime,
> if (new)cfs_rq->min_vruntime >> (old)cfs_rq->min_vruntime.
>
> This patch fixes this problem by just ignoring such process in task_move_group_fair(),
> because the vruntime has already been normalized in task_fork_fair().
>
> Signed-off-by: Daisuke Nishimura <nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@xxxxxxxxxxxxxxxx>
> ---This would need an explanatory
>  kernel/sched_fair.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index 5c9e679..df145a9 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -4922,10 +4922,10 @@ static void task_move_group_fair(struct task_struct *p, int on_rq)
>  	 * to another cgroup's rq. This does somewhat interfere with the
>  	 * fair sleeper stuff for the first placement, but who cares.
>  	 */
> -	if (!on_rq)
> +	if (!on_rq && p->state != TASK_RUNNING)
>  		p->se.vruntime -= cfs_rq_of(&p->se)->min_vruntime;


We also have the choice of keying off something like
!se.sum_exec_runtime here which is reset in sched_fork() which might
be less fragile/make the problem interaction more obvious to.  Either
way this is really a corner case deserving of an explanatory comment.
This is a little icky but I don't have any wildly better ideas.

>  	set_task_rq(p, task_cpu(p));
> -	if (!on_rq)
> +	if (!on_rq && p->state != TASK_RUNNING)
>  		p->se.vruntime += cfs_rq_of(&p->se)->min_vruntime;
>  }
>  #endif


Acked-by: Paul Turner <pjt@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux