On Thu, Oct 25, 2012 at 5:16 AM, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> +	/*
> +	 * Using runtime rather than walltime has the dual advantage that
> +	 * we (mostly) drive the selection from busy threads and that the
> +	 * task needs to have done some actual work before we bother with
> +	 * NUMA placement.
> +	 */

That explanation makes sense..

> +	now = curr->se.sum_exec_runtime;
> +	period = (u64)curr->numa_scan_period * NSEC_PER_MSEC;
> +
> +	if (now - curr->node_stamp > period) {
> +		curr->node_stamp = now;
> +
> +		if (!time_before(jiffies, curr->mm->numa_next_scan)) {

.. but then the whole "numa_next_scan" thing ends up being about
real-time anyway?

So 'numa_scan_period' is in CPU time (msec, converted to nsec at
runtime rather than when setting it), but 'numa_next_scan' is in
wallclock time (jiffies)?

But *both* of them are based on the same 'numa_scan_period' thing
that the user sets in ms. So numa_scan_period is interpreted as both
wallclock *and* as runtime?

Maybe this works, but it doesn't really make much sense. And what is
the impact of this on machines that run lots of loads with delays
(whether due to IO or timers)?

              Linus
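
For readers following along, here is a minimal userspace sketch of the two
throttles being contrasted above. The field names mirror the quoted patch
(node_stamp, numa_next_scan, numa_scan_period), but the surrounding
scaffolding, the HZ value, and the reset of numa_next_scan after a scan are
assumptions made for illustration; this is not the kernel implementation.
The point is only that the same millisecond knob is compared once against
per-task CPU runtime and once against wallclock jiffies:

/*
 * Sketch of the dual throttle: a runtime gate per task, then a
 * wallclock (jiffies) gate per mm, both derived from numa_scan_period.
 */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define NSEC_PER_MSEC 1000000ULL
#define HZ 1000                          /* assume 1000 jiffies per second */

struct mm_sketch {
	unsigned long numa_next_scan;    /* wallclock deadline, in jiffies */
};

struct task_sketch {
	uint64_t sum_exec_runtime;       /* CPU time consumed, in ns */
	uint64_t node_stamp;             /* runtime stamp of last check, ns */
	unsigned int numa_scan_period;   /* user-visible knob, in ms */
	struct mm_sketch *mm;
};

/* simplified stand-ins for the kernel's jiffies counter and time_before() */
static unsigned long jiffies;
static bool time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

static unsigned long msecs_to_jiffies(unsigned int ms)
{
	return (unsigned long)ms * HZ / 1000;
}

/*
 * Gate 1: has the task burned numa_scan_period worth of *CPU time* since
 * the last check?  Gate 2: has numa_scan_period worth of *wallclock time*
 * passed for the whole mm?  A scan happens only when both are true.
 */
static bool should_scan(struct task_sketch *curr)
{
	uint64_t now = curr->sum_exec_runtime;
	uint64_t period = (uint64_t)curr->numa_scan_period * NSEC_PER_MSEC;

	if (now - curr->node_stamp <= period)
		return false;            /* not enough runtime yet */
	curr->node_stamp = now;

	if (time_before(jiffies, curr->mm->numa_next_scan))
		return false;            /* wallclock deadline not reached */
	/* assumed reset; the quoted hunk does not show this part */
	curr->mm->numa_next_scan = jiffies + msecs_to_jiffies(curr->numa_scan_period);
	return true;
}

int main(void)
{
	struct mm_sketch mm = { .numa_next_scan = 0 };
	struct task_sketch t = {
		.sum_exec_runtime = 0, .node_stamp = 0,
		.numa_scan_period = 100, .mm = &mm,
	};

	/*
	 * A mostly idle task: 1000 ms of wallclock pass but only 10 ms of
	 * CPU time is consumed, so the runtime gate never opens even though
	 * the jiffies deadline expired long ago.
	 */
	jiffies += msecs_to_jiffies(1000);
	t.sum_exec_runtime += 10 * NSEC_PER_MSEC;
	printf("idle-ish task scans? %s\n", should_scan(&t) ? "yes" : "no");

	/* A busy task: another 150 ms of CPU time consumed, both gates open. */
	t.sum_exec_runtime += 150 * NSEC_PER_MSEC;
	printf("busy task scans?     %s\n", should_scan(&t) ? "yes" : "no");
	return 0;
}

As the first case shows, a load that spends most of its wallclock time
blocked on IO or timers keeps failing the runtime gate, which is exactly
the class of workload Linus asks about at the end of the mail.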