On Fri, Jul 17, 2009 at 02:31:50PM +0200, Peter Zijlstra wrote:
> On Fri, 2009-07-17 at 12:25 +0000, Thomas Gleixner wrote:
> > plain text document attachment (freezer-fix-accounting-for-real.patch)
> > commit e3c8ca8336 (sched: do not count frozen tasks toward load) broke
> > the nr_uninterruptible accounting on freeze/thaw. On freeze the task
> > is excluded from accounting with a check for (task->flags &
> > PF_FROZEN), but that flag is cleared before the task is thawed. So
> > while we prevent that the freezing task with state
> > TASK_UNINTERRUPTIBLE is accounted to nr_uninterruptible we decrement
> > nr_uninterruptible on thaw.
> >
> > Use a separate flag which is handled by the freezing task itself. Set
> > it before calling the scheduler with TASK_UNINTERRUPTIBLE state and
> > clear it after we return from frozen state.
>
> Right, so I'm wondering why we don't fully revert e3c8ca8336 to begin
> with.
>
> The changelog reads:
>
> ---
> commit e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd
> Author: Nathan Lynch <ntl@xxxxxxxxx>
> Date:   Wed Apr 8 19:45:12 2009 -0500
>
>     sched: do not count frozen tasks toward load
>
>     Freezing tasks via the cgroup freezer causes the load average to climb
>     because the freezer's current implementation puts frozen tasks in
>     uninterruptible sleep (D state).
>
>     Some applications which perform job-scheduling functions consult the
>     load average when making decisions.  If a cgroup is frozen, the load
>     average does not provide a useful measure of the system's utilization
>     to such applications.  This is especially inconvenient if the job
>     scheduler employs the cgroup freezer as a mechanism for preempting low
>     priority jobs.  Contrast this with using SIGSTOP for the same purpose:
>     the stopped tasks do not count toward system load.
>
>     Change task_contributes_to_load() to return false if the task is
>     frozen.  This results in /proc/loadavg behavior that better meets
>     users' expectations.
> ---
>
> It appears to me that a frozen cgroup is a transient state. Either you
> would typically do something like:
>
>   freeze -> {snapshot, migrate} -> {thaw, destroy}
>
> Therefore a short increase in load doesn't seem like too big a problem,
> its going to be gone soon anyway.
>
> Hmm?

The job scheduler in question does not use FROZEN as a transient state
and does not use checkpoint/restart at all, since c/r is still a work in
progress. Even when used for power management, it seems wrong to count
frozen tasks towards the loadavg, since they aren't using CPU time or
waiting for IO.

Cheers,
	-Matt Helsley