On Wed, Aug 23, 2017 at 8:58 AM, Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:
>
> Will you still consider the original patch as a fail safe mechanism?

I don't think we have much choice, although I would *really* want to
get this root-caused rather than just papering over the symptoms.

Maybe still worth testing that "sched/numa: Scale scan period with
tasks in group and shared/private" patch that Mel mentioned.

In fact, looking at that patch description, it does seem to match this
particular load a lot. Quoting from the commit message:

  "Running 80 tasks in the same group, or as threads of the same
   process, results in the memory getting scanned 80x as fast as it
   would be if a single task was using the memory. This really hurts
   some workloads"

So if 80 threads causes 80x as much scanning, a few thousand threads
might indeed be really really bad.

So once more unto the breach, dear friends, once more.

Please.

The patch got applied to -tip as commit b5dd77c8bdad, and can be
downloaded here:

  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=b5dd77c8bdada7b6262d0cba02a6ed525bf4e6e1

(Hmm. It says it's cc'd to me, but I never noticed that patch simply
because it was in a big group of other -tip commits.. Oh well).

          Linus