On Thu, Jun 17, 2021 at 08:55:44PM +0900, Janghyuck Kim wrote:
> On Wed, Jun 16, 2021 at 06:32:50PM +0100, Matthew Wilcox wrote:
> > On Wed, Jun 16, 2021 at 05:37:41PM +0900, Janghyuck Kim wrote:
> > > Architecture might support fake node when CONFIG_NUMA is enabled but any
> > > node settings were supported by ACPI or device tree. In this case,
> > > getting memory policy during memory allocation path is meaningless.
> > >
> > > Moreover, performance degradation was observed in the minor page fault
> > > test, which is provided by (https://lkml.org/lkml/2006/8/29/294).
> > > Average faults/sec of enabling NUMA with fake node was 5~6 % worse than
> > > disabling NUMA. To reduce this performance regression, fastpath is
> > > introduced. fastpath can skip the memory policy checking if NUMA is
> > > enabled but it uses fake node. If architecture doesn't support fake
> > > node, fastpath affects nothing for memory allocation path.
> >
> > This patch doesn't even apply to the current kernel, but putting that
> > aside, what's the expensive part of the current code? That is,
> > comparing performance stats between this numa_off enabled and numa_off
> > disabled, where do you see taking a lot of time?
> >
>
> mempolicy related code that I skipped by this patch took a short time,
> taking only a few tens of nanoseconds that is difficult to measure by
> sched_clock's degree of precision. But it can affect the minor page
> fault test with large buffer size, because one page fault handling takes
> several ms. As I replied in previous mail, performance regression has
> been reduced from 5~6% to 2~3%.

I'm not proposing you use sched_clock.  Try perf.
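
For anyone wanting a small workload to profile, here is a rough userspace sketch of a minor page fault test. It is not the test from the lkml.org link above, and the names `measure_minor_faults` and `faults_per_second` are made up for illustration. It assumes a Linux system where `ru_minflt` is populated; with transparent huge pages enabled the fault count can be much lower than the page count.

```python
import mmap
import resource
import time

PAGE_SIZE = mmap.PAGESIZE


def measure_minor_faults(num_pages):
    """Touch num_pages fresh anonymous pages.

    Returns (minor_faults, elapsed_seconds) for the touching loop.
    The helper name is hypothetical, not taken from the thread.
    """
    size = num_pages * PAGE_SIZE
    # Private anonymous mapping: pages are only allocated on first touch.
    buf = mmap.mmap(
        -1,
        size,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    try:
        before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        start = time.perf_counter()
        for offset in range(0, size, PAGE_SIZE):
            buf[offset] = 1  # first write to a page goes through the fault path
        elapsed = time.perf_counter() - start
        after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    finally:
        buf.close()
    return after - before, elapsed


def faults_per_second(faults, elapsed):
    """Average fault rate; returns 0.0 if no time was measured."""
    return faults / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    faults, elapsed = measure_minor_faults(65536)
    print(f"{faults} minor faults in {elapsed:.4f}s "
          f"({faults_per_second(faults, elapsed):.0f} faults/sec)")
```

Running something like this under `perf record -g` and reading the result with `perf report`, once with NUMA enabled and once disabled, should show where the extra time goes far more precisely than timestamps around the mempolicy code.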