Hi Jonathan, thanks for the comments.

> -----Original Message-----
> From: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Sent: Monday, July 6, 2020 6:46 PM
> To: Justin He <Justin.He@xxxxxxx>
> Cc: Catalin Marinas <Catalin.Marinas@xxxxxxx>; Will Deacon
> <will@xxxxxxxxxx>; Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; Mike
> Rapoport <rppt@xxxxxxxxxxxxx>; Baoquan He <bhe@xxxxxxxxxx>; Chuhong Yuan
> <hslester96@xxxxxxxxx>; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; Kaly Xin <Kaly.Xin@xxxxxxx>
> Subject: Re: [PATCH 1/3] arm64/numa: set numa_off to false when numa node
> is fake
>
> On Mon, 6 Jul 2020 11:29:21 +0100
> Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:
>
> > On Mon, 6 Jul 2020 09:19:45 +0800
> > Jia He <justin.he@xxxxxxx> wrote:
> >
> > Hi,
> >
> > > Previously, numa_off is set to true unconditionally in dummy_numa_init(),
> > > even if there is a fake numa node.
> > >
> > > But acpi will translate node id to NUMA_NO_NODE(-1) in acpi_map_pxm_to_node()
> > > because it regards numa_off as turning off the numa node.
> >
> > That is correct. It is operating exactly as it should, if SRAT hasn't been parsed
> > and you are on ACPI platform there are no nodes. They cannot be created at
> > some later date. The dummy code doesn't change this. It just does enough to carry
> > on operating with no specified nodes.
> >
> > >
> > > Without this patch, pmem can't be probed as a RAM device on arm64 if SRAT table
> > > isn't present.
> > >
> > > $ndctl create-namespace -fe namespace0.0 --mode=devdax --map=dev -s 1g -a 64K
> > > kmem dax0.0: rejecting DAX region [mem 0x240400000-0x2bfffffff] with invalid node: -1
> > > kmem: probe of dax0.0 failed with error -22
> > >
> > > This fixes it by setting numa_off to false.
> >
> > Without the SRAT protection patch [1] you may well run into problems

Sorry, I don't quite understand this. Do you mean that your patch [1] can
resolve this issue?
But acpi_map_pxm_to_node() has already returned NUMA_NO_NODE after the
following check:

	if (pxm < 0 || pxm >= MAX_PXM_DOMAINS || numa_off)
		return NUMA_NO_NODE;

So it seems that even your patch [1] would not help here? Please correct me
if my understanding is wrong.

[1] https://patchwork.kernel.org/patch/11632063/

> > because someone somewhere will have _PXM in a DSDT but will
> > have a non existent SRAT. We had this happen on an AMD platform when we
> > tried to introduce working _PXM support for PCI. [2]
> >
> > So whilst this seems superficially safe, I'd definitely be crossing your fingers.
> > Note, at that time I proposed putting the numa_off = false into the x86 code
> > path precisely to cut out that possibility (was rejected at the time, at least
> > partly because the clarifications to the ACPI spec were not public.)
> >
> > The patch in [1] should sort things out however by ensuring we only create
> > new domains where we should actually be doing so. However, in your case
> > it will return NUMA_NO_NODE anyway so this isn't the right way to fix things.

Okay, let me try to summarize. There are three possible ways to fix this:
1. This patch, which neither you nor David seems satisfied with 😉
2. My previous proposal [2], which is similar to what David suggested.
3. Remove the numa_off check in acpi_map_pxm_to_node(), e.g.:
	...
	if (pxm < 0 || pxm >= MAX_PXM_DOMAINS /*|| numa_off*/)
		return NUMA_NO_NODE;

[2] https://lkml.org/lkml/2019/8/16/367

--
Cheers,
Justin (Jia He)
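
P.S. To make the difference between the current guard and option 3 concrete,
here is a rough user-space sketch. It is not the kernel code: the function
names model_pxm_rejected() and model_map_pxm_to_node() are hypothetical, the
value of MAX_PXM_DOMAINS is an assumption, and the identity pxm-to-node mapping
is a simplification (the real function looks up or allocates a node). Only the
guard condition itself is taken from the discussion above.

```c
#include <stdbool.h>

#define NUMA_NO_NODE    (-1)
#define MAX_PXM_DOMAINS 256	/* assumed value, for this sketch only */

/*
 * Hypothetical model of the guard at the top of acpi_map_pxm_to_node().
 * honor_numa_off == true  : current behaviour (numa_off rejects every pxm)
 * honor_numa_off == false : option 3 (numa_off is ignored by the guard)
 */
bool model_pxm_rejected(int pxm, bool numa_off, bool honor_numa_off)
{
	if (pxm < 0 || pxm >= MAX_PXM_DOMAINS)
		return true;
	return honor_numa_off && numa_off;
}

/*
 * Hypothetical stand-in for the mapping itself. An accepted pxm is mapped
 * to the node with the same number, which is a simplification.
 */
int model_map_pxm_to_node(int pxm, bool numa_off, bool honor_numa_off)
{
	if (model_pxm_rejected(pxm, numa_off, honor_numa_off))
		return NUMA_NO_NODE;
	return pxm;
}
```

With numa_off set, model_map_pxm_to_node(0, true, true) gives NUMA_NO_NODE,
which corresponds to the "invalid node: -1" rejection in the kmem log, while
model_map_pxm_to_node(0, true, false) gives node 0. Out-of-range pxm values are
rejected in both variants.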