On 2/20/24 11:55 AM, Huang, Ying wrote: > "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxx> writes: > >> On 2/20/24 6:51 AM, Andrew Morton wrote: >>> On Mon, 19 Feb 2024 14:04:23 +0530 Donet Tom <donettom@xxxxxxxxxxxxx> wrote: >>> >>>>>> --- a/mm/mempolicy.c >>>>>> +++ b/mm/mempolicy.c >>>>>> @@ -2526,7 +2526,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, >>>>>> if (node_isset(curnid, pol->nodes)) >>>>>> goto out; >>>>>> z = first_zones_zonelist( >>>>>> - node_zonelist(numa_node_id(), GFP_HIGHUSER), >>>>>> + node_zonelist(thisnid, GFP_HIGHUSER), >>>>>> gfp_zone(GFP_HIGHUSER), >>>>>> &pol->nodes); >>>>>> polnid = zone_to_nid(z->zone); >>>>> int thisnid = cpu_to_node(thiscpu); >>>>> >>>>> Is there any dofference between numa_node_id() and >>>>> cpu_to_node(raw_smp_processor_id())? And it it explicable that we're >>>>> using one here and not the other? >>>> >>>> Hi Andrew >>>> >>>> Both numa_node_id() and cpu_to_node(raw_smp_processor_id()) return the current execution node id, >>>> Since the current execution node is already fetched at the beginning (thisnid) we can reuse it instead of getting it again. >>> >>> Sure, but mine was a broader thought: why do we have both? Is one >>> preferable and if so why? >> >> IIUC these are two helpers to fetch current numa node id. and either of them can be used based on need. The default implementation shows the details. >> (One small difference is numa_node_id() can use optimized per cpu reader because it is fetching the per cpu variable of the currently running cpu.) >> >> #ifndef numa_node_id >> /* Returns the number of the current Node. */ >> static inline int numa_node_id(void) >> { >> return raw_cpu_read(numa_node); >> } >> #endif >> >> #ifndef cpu_to_node >> static inline int cpu_to_node(int cpu) >> { >> return per_cpu(numa_node, cpu); >> } >> #endif >> >> In mpol_misplaced function, we need the cpu details because we are using that in other place (should_numa_migreate_memory()). So it makes it easy >> to use cpu_to_node(thiscpu) instead of numa_node_id(). > > IIUC, numa_node_id() is faster than cpu_to_node(thiscpu), even if we > have thiscpu already. cpu_to_node() is mainly used to get the node of > NOT current CPU. So, IMHO, we should only use numa_node_id() in this > function. > This change? modified mm/mempolicy.c @@ -2502,8 +2502,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, pgoff_t ilx; struct zoneref *z; int curnid = folio_nid(folio); - int thiscpu = raw_smp_processor_id(); - int thisnid = cpu_to_node(thiscpu); + int thisnid = numa_node_id(); int polnid = NUMA_NO_NODE; int ret = NUMA_NO_NODE; @@ -2573,7 +2572,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, polnid = thisnid; if (!should_numa_migrate_memory(current, folio, curnid, - thiscpu)) + raw_smp_processor_id())) goto out; } -aneesh