Re: [PATCH v2 1/3] arm64/numa: export memory_add_physaddr_to_nid as EXPORT_SYMBOL_GPL

Mike Rapoport <rppt@xxxxxxxxxxxxx> · Wed, 8 Jul 2020 10:43:25 +0300

On Wed, Jul 08, 2020 at 09:16:01AM +0200, David Hildenbrand wrote:
> On 08.07.20 09:04, Dan Williams wrote:
> > On Tue, Jul 7, 2020 at 11:59 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>
> >> On 08.07.20 08:22, Mike Rapoport wrote:
> >>> On Tue, Jul 07, 2020 at 09:27:43PM -0700, Dan Williams wrote:
> >>>> On Tue, Jul 7, 2020 at 9:08 PM Justin He <Justin.He@xxxxxxx> wrote:
> >>>> [..]
> >>>>>> Especially for architectures that use memblock info for numa info
> >>>>>> (which seems to be everyone except x86) why not implement a generic
> >>>>>> memory_add_physaddr_to_nid() that does:
> >>>>>>
> >>>>>> int memory_add_physaddr_to_nid(u64 addr)
> >>>>>> {
> >>>>>>         unsigned long start_pfn, end_pfn, pfn = PHYS_PFN(addr);
> >>>>>>         int nid;
> >>>>>>
> >>>>>>         for_each_online_node(nid) {
> >>>>>>                 get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
> >>>>>>                 if (pfn >= start_pfn && pfn < end_pfn)
> >>>>>>                         return nid;
> >>>>>>         }
> >>>>>>         return NUMA_NO_NODE;
> >>>>>> }
> >>>>>
> >>>>> Thanks for your suggestion,
> >>>>> Could I wrap the codes and let memory_add_physaddr_to_nid simply invoke
> >>>>> phys_to_target_node()?
> >>>>
> >>>> I think it needs to be the reverse. phys_to_target_node() should call
> >>>> memory_add_physaddr_to_nid() by default, but fall back to searching
> >>>> reserved memory address ranges in memblock. See phys_to_target_node()
> >>>> in arch/x86/mm/numa.c. That one uses numa_meminfo instead of memblock,
> >>>> but the principle is the same i.e. that a target node may not be
> >>>> represented in memblock.memory, but memblock.reserved. I'm working on
> >>>> a patch to provide a function similar to get_pfn_range_for_nid() that
> >>>> operates on reserved memory.
> >>>
> >>> Do we really need yet another memblock iterator?
> >>> I think only x86 has memory that is not in memblock.memory but only in
> >>> memblock.reserved.
> >>
> >> Reading about abusing the memblock allocator once again in memory
> >> hotplug paths makes me shiver.
> > 
> > Technical reasoning please?
> 
> ARCH_KEEP_MEMBLOCK is (AFAIK) only a hack for arm64 to implement
> pfn_valid(), because they zap out individual pages corresponding to
> memory holes of full sections.
> 
> I am not a friend of adding more post-init code to rely on memblock
> data. It just makes it harder to eventually get rid of ARCH_KEEP_MEMBLOCK.

The heaviest user of memblock in post-init code is powerpc. It won't
be easy to get rid of it there.

> > arm64 numa information is established from memblock data. It seems
> > counterproductive to ignore that fact if we're already touching
> > memory_add_physaddr_to_nid() and have a use case for a driver to call
> > it.
> 
> ... and we are trying to handle the "only a single dummy node" case
> (patch #2), or what am I missing? What is there to optimize currently?
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 
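
For reference, the lookup order discussed earlier in the thread can be
modelled in userspace roughly as below. This is only a sketch under
stated assumptions, not kernel code: the two range tables, struct
node_range and lookup_nid() are hypothetical stand-ins for
memblock.memory, memblock.reserved and the per-node pfn range query,
and the 4K page size is assumed. It treats end_pfn as exclusive, which
is my understanding of what get_pfn_range_for_nid() reports.

```c
#include <stddef.h>
#include <stdint.h>

#define NUMA_NO_NODE	(-1)
#define PAGE_SHIFT	12	/* assumption: 4K pages */
#define PHYS_PFN(addr)	((uint64_t)(addr) >> PAGE_SHIFT)

/* Hypothetical per-node pfn range; end_pfn is exclusive. */
struct node_range {
	int nid;
	uint64_t start_pfn;
	uint64_t end_pfn;
};

/* Stand-in for ranges that would come from memblock.memory. */
static const struct node_range memory_ranges[] = {
	{ 0, 0x00000, 0x40000 },
	{ 1, 0x40000, 0x80000 },
};

/* Stand-in for ranges that only show up in memblock.reserved. */
static const struct node_range reserved_ranges[] = {
	{ 2, 0x100000, 0x140000 },
};

/* Return the node whose [start_pfn, end_pfn) range contains pfn. */
static int lookup_nid(const struct node_range *r, size_t n, uint64_t pfn)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (pfn >= r[i].start_pfn && pfn < r[i].end_pfn)
			return r[i].nid;
	}
	return NUMA_NO_NODE;
}

/* Node lookup that only consults the "memory" ranges. */
int memory_add_physaddr_to_nid(uint64_t addr)
{
	return lookup_nid(memory_ranges,
			  sizeof(memory_ranges) / sizeof(memory_ranges[0]),
			  PHYS_PFN(addr));
}

/* Try the memory ranges first, then fall back to the reserved ones. */
int phys_to_target_node(uint64_t addr)
{
	int nid = memory_add_physaddr_to_nid(addr);

	if (nid != NUMA_NO_NODE)
		return nid;
	return lookup_nid(reserved_ranges,
			  sizeof(reserved_ranges) / sizeof(reserved_ranges[0]),
			  PHYS_PFN(addr));
}
```

With these made-up tables, an address at the start of node 1 resolves
to node 1 rather than node 0, and an address that is only covered by a
reserved range resolves through phys_to_target_node() but not through
memory_add_physaddr_to_nid().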

-- 
Sincerely yours,
Mike.