On Thu, Oct 25, 2012 at 5:16 AM, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> +	/*
> +	 * Using runtime rather than walltime has the dual advantage that
> +	 * we (mostly) drive the selection from busy threads and that the
> +	 * task needs to have done some actual work before we bother with
> +	 * NUMA placement.
> +	 */

That explanation makes sense..

> +	now = curr->se.sum_exec_runtime;
> +	period = (u64)curr->numa_scan_period * NSEC_PER_MSEC;
> +
> +	if (now - curr->node_stamp > period) {
> +		curr->node_stamp = now;
> +
> +		if (!time_before(jiffies, curr->mm->numa_next_scan)) {

.. but then the whole "numa_next_scan" thing ends up being about
real-time anyway?

So 'numa_scan_period' is in CPU time (msec, converted to nsec at
runtime rather than when setting it), but 'numa_next_scan' is in
wallclock time (jiffies)?

But *both* of them are based on the same 'numa_scan_period' thing
that the user sets in ms. So numa_scan_period is interpreted as both
wallclock *and* as runtime?

Maybe this works, but it doesn't really make much sense. And what is
the impact of this on machines that run lots of loads with delays
(whether due to IO or timers)?

              Linus
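
For readers following along, here is a minimal userspace sketch of the two
throttles being contrasted above. The field names mirror the quoted patch
(node_stamp, numa_next_scan, numa_scan_period), but the surrounding
scaffolding, the HZ value, and the reset of numa_next_scan after a scan are
assumptions made for illustration; this is not the kernel implementation.
The point is only that the same millisecond knob is compared once against
per-task CPU runtime and once against wallclock jiffies:

/*
 * Sketch of the dual throttle: a runtime gate per task, then a
 * wallclock (jiffies) gate per mm, both derived from numa_scan_period.
 */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define NSEC_PER_MSEC 1000000ULL
#define HZ 1000                          /* assume 1000 jiffies per second */

struct mm_sketch {
	unsigned long numa_next_scan;    /* wallclock deadline, in jiffies */
};

struct task_sketch {
	uint64_t sum_exec_runtime;       /* CPU time consumed, in ns */
	uint64_t node_stamp;             /* runtime stamp of last check, ns */
	unsigned int numa_scan_period;   /* user-visible knob, in ms */
	struct mm_sketch *mm;
};

/* simplified stand-ins for the kernel's jiffies counter and time_before() */
static unsigned long jiffies;
static bool time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

static unsigned long msecs_to_jiffies(unsigned int ms)
{
	return (unsigned long)ms * HZ / 1000;
}

/*
 * Gate 1: has the task burned numa_scan_period worth of *CPU time* since
 * the last check?  Gate 2: has numa_scan_period worth of *wallclock time*
 * passed for the whole mm?  A scan happens only when both are true.
 */
static bool should_scan(struct task_sketch *curr)
{
	uint64_t now = curr->sum_exec_runtime;
	uint64_t period = (uint64_t)curr->numa_scan_period * NSEC_PER_MSEC;

	if (now - curr->node_stamp <= period)
		return false;            /* not enough runtime yet */
	curr->node_stamp = now;

	if (time_before(jiffies, curr->mm->numa_next_scan))
		return false;            /* wallclock deadline not reached */
	/* assumed reset; the quoted hunk does not show this part */
	curr->mm->numa_next_scan = jiffies + msecs_to_jiffies(curr->numa_scan_period);
	return true;
}

int main(void)
{
	struct mm_sketch mm = { .numa_next_scan = 0 };
	struct task_sketch t = {
		.sum_exec_runtime = 0, .node_stamp = 0,
		.numa_scan_period = 100, .mm = &mm,
	};

	/*
	 * A mostly idle task: 1000 ms of wallclock pass but only 10 ms of
	 * CPU time is consumed, so the runtime gate never opens even though
	 * the jiffies deadline expired long ago.
	 */
	jiffies += msecs_to_jiffies(1000);
	t.sum_exec_runtime += 10 * NSEC_PER_MSEC;
	printf("idle-ish task scans? %s\n", should_scan(&t) ? "yes" : "no");

	/* A busy task: another 150 ms of CPU time consumed, both gates open. */
	t.sum_exec_runtime += 150 * NSEC_PER_MSEC;
	printf("busy task scans?     %s\n", should_scan(&t) ? "yes" : "no");
	return 0;
}

As the first case shows, a load that spends most of its wallclock time
blocked on IO or timers keeps failing the runtime gate, which is exactly
the class of workload Linus asks about at the end of the mail.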