On Mon, May 07, 2018 at 04:06:07AM -0700, Srikar Dronamraju wrote:
> > @@ -1876,7 +1877,18 @@ static void numa_migrate_preferred(struct task_struct *p)
> >
> >  	/* Periodically retry migrating the task to the preferred node */
> >  	interval = min(interval, msecs_to_jiffies(p->numa_scan_period) / 16);
> > -	p->numa_migrate_retry = jiffies + interval;
> > +	numa_migrate_retry = jiffies + interval;
> > +
> > +	/*
> > +	 * Check that the new retry threshold is after the current one. If
> > +	 * the retry is in the future, it implies that wake_affine has
> > +	 * temporarily asked NUMA balancing to backoff from placement.
> > +	 */
> > +	if (numa_migrate_retry > p->numa_migrate_retry)
> > +		return;
>
> The above check looks wrong. This check will most likely to be true,
> numa_migrate_preferred() itself is called either when jiffies >
> p->numa_migrate_retry or if the task's numa_preferred_nid has changed.
>

Sorry for the delay getting back -- viral infections combined with a public
day off is slowing me.

You're right, without affine wakeups with a wakeup-intensive workload the
path may never be hit and with the current code, it effectively acts as a
broken throttling mechanism. However, I've confirmed that "fixing" it has
mixed results with many regressions on x86 for both 2 and 4 socket boxes.
I need time to think about it and see if this can be fixed without
introducing another regression. I'll also check if a plain revert is the
way to go for a short-term fix and then revisit it.

Thanks Srikar.

-- 
Mel Gorman
SUSE Labs
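The review comment quoted above can be checked with a small userspace model. This is only a sketch, not kernel code: jiffies_t, retry_due(), patch_returns_early() and entries_past_check() are hypothetical names made up for the illustration, counter wraparound is ignored, and only the comparison shown in the quoted hunk is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's jiffies counter (wraparound ignored). */
typedef unsigned long jiffies_t;

/*
 * Models the first entry condition named in the review: the function is
 * called once jiffies has passed the stored retry time. The second
 * condition (numa_preferred_nid changed) is not modelled here.
 */
bool retry_due(jiffies_t now, jiffies_t migrate_retry)
{
	return now > migrate_retry;
}

/*
 * Models the comparison from the quoted hunk: true means the function
 * would return at the new check.
 */
bool patch_returns_early(jiffies_t now, jiffies_t interval,
			 jiffies_t migrate_retry)
{
	jiffies_t new_retry;

	assert(now + interval >= now);	/* this model does not handle wraparound */
	new_retry = now + interval;
	return new_retry > migrate_retry;
}

/*
 * Counts the entry times in (migrate_retry, migrate_retry + span] that
 * satisfy the entry condition and still get past the new check.
 */
unsigned long entries_past_check(jiffies_t migrate_retry, jiffies_t interval,
				 jiffies_t span)
{
	unsigned long passed = 0;
	jiffies_t now;

	for (now = migrate_retry + 1; now <= migrate_retry + span; now++) {
		if (retry_due(now, migrate_retry) &&
		    !patch_returns_early(now, interval, migrate_retry))
			passed++;
	}
	return passed;
}
```

In this model, any entry triggered by jiffies > p->numa_migrate_retry gives jiffies + interval > p->numa_migrate_retry as well, so entries_past_check() returns 0 for any span. The check is only skipped when the stored retry time lies further in the future than jiffies + interval, which in the model can only happen on an entry caused by a changed numa_preferred_nid.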