RE: [PATCH v3] ACPI: NUMA: Add a node and memblk for each CFMWS not in SRAT

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Alison Schofield <alison.schofield@xxxxxxxxx>
> Sent: Wednesday, October 20, 2021 8:00 PM
> To: Vikram Sethi <vsethi@xxxxxxxxxx>
> Cc: Rafael J. Wysocki <rafael@xxxxxxxxxx>; Len Brown <lenb@xxxxxxxxxx>;
> Vishal Verma <vishal.l.verma@xxxxxxxxx>; Ira Weiny <ira.weiny@xxxxxxxxx>;
> Ben Widawsky <ben.widawsky@xxxxxxxxx>; Dan Williams
> <dan.j.williams@xxxxxxxxx>; linux-cxl@xxxxxxxxxxxxxxx; linux-
> acpi@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v3] ACPI: NUMA: Add a node and memblk for each
> CFMWS not in SRAT
>
snip
 
> > >
> > > Consumers can use phys_to_target_node() to discover the NUMA node.
> >
> > Does this patch work for CXL type 2 memory which is not in SRAT? A
> > type 2 driver can find its HDM BASE physical address from its CXL
> > registers and figure out its NUMA node id by calling phys_to_target_node?
> 
> Yes. This adds the nodes for the case where the BIOS doesn't fully describe
> everything in CFMWS in the SRAT. And, yes, that is how the NUMA node can
> be discovered.
> 
> > Or is type 2 HDM currently being skipped altogether?
> 
> Not sure what you mean by 'being skipped altogether'? The BIOS may
> describe (all or none or some) of CXL Memory in the SRAT. In the case where
> BIOS describes it all, NUMA nodes will already exist, and no new nodes will
> be added here.
> 
My question about skipping type2 wasn't directly related to your patch,
but more of a question about current upstream support for probe/configuration
of type 2 accelerator devices memory, irrespective of whether FW shows type 2 
memory in SRAT.
The desired outcome is that the kernel CXL driver recognizes such type 2 HDM, 
and assigns it a NUMA node such that the type 2 driver can later add/online this memory, 
via add_memory_driver_managed which requires a NUMA node ID (which driver can 
discover after your patch by calling phys_to_target_node).
Would the current upstream code for HDM work as described above, and if so, does it
rely on CDAT DSEMTS structure showing a specific value for EFI memory type? i.e would it
work if that field in DSEMTS was either EFI_CONVENTIONAL_MEMORY with EFI_MEMORY_SP, 
or EFI_RESERVED_MEMORY?

Thanks,
Vikram


> Alison
> 
> >
> > Thanks
> > >
> > > Signed-off-by: Alison Schofield <alison.schofield@xxxxxxxxx>
> > > ---
> > > Changes in v3:
> > > - Initial variable pxm (Ben)
> > >
> > > Changes in v2:
> > > - Use MAX_NUMNODES as max value when searching
> node_to_pxm_map()
> > > (0-day)
> > > - Add braces around single statement for loop (coding style)
> > > - Rename acpi_parse_cfmws() to acpi_cxl_cfmws_init to be more like
> other
> > >   functions in this file doing similar work.
> > > - Comments: remove superflous and state importance of the init order,
> > >   CFMWS after SRAT, (Ira, Dan)
> > > - Add prototype for numa_add_memblk() (0-day)
> > >
> > >  drivers/acpi/numa/srat.c | 70
> > > ++++++++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/acpi.c       |  8 +++--
> > >  include/linux/acpi.h     |  1 +
> > >  3 files changed, 76 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
> > > index
> > > b8795fc49097..daaaef58f1e6 100644
> > > --- a/drivers/acpi/numa/srat.c
> > > +++ b/drivers/acpi/numa/srat.c
> > > @@ -300,6 +300,67 @@ acpi_numa_memory_affinity_init(struct
> > > acpi_srat_mem_affinity *ma)  }  #endif /* defined(CONFIG_X86) ||
> > > defined
> > > (CONFIG_ARM64) */
> > >
> > > +static int __init acpi_cxl_cfmws_init(struct acpi_table_header
> > > +*acpi_cedt) {
> > > +       struct acpi_cedt_cfmws *cfmws;
> > > +       acpi_size len, cur = 0;
> > > +       int i, node, pxm = 0;
> > > +       void *cedt_subtable;
> > > +       u64 start, end;
> > > +
> > > +       /* Find the max PXM defined in the SRAT */
> > > +       for (i = 0; i < MAX_NUMNODES - 1; i++) {
> > > +               if (node_to_pxm_map[i] > pxm)
> > > +                       pxm = node_to_pxm_map[i];
> > > +       }
> > > +       /* Start assigning fake PXM values after the SRAT max PXM */
> > > +       pxm++;
> > > +
> > > +       len = acpi_cedt->length - sizeof(*acpi_cedt);
> > > +       cedt_subtable = acpi_cedt + 1;
> > > +
> > > +       while (cur < len) {
> > > +               struct acpi_cedt_header *c = cedt_subtable + cur;
> > > +
> > > +               if (c->type != ACPI_CEDT_TYPE_CFMWS)
> > > +                       goto next;
> > > +
> > > +               cfmws = cedt_subtable + cur;
> > > +               if (cfmws->header.length < sizeof(*cfmws)) {
> > > +                       pr_warn_once("CFMWS entry skipped:invalid length:%u\n",
> > > +                                    cfmws->header.length);
> > > +                       goto next;
> > > +               }
> > > +
> > > +               start = cfmws->base_hpa;
> > > +               end = cfmws->base_hpa + cfmws->window_size;
> > > +
> > > +               /*
> > > +                * Skip if the SRAT already described
> > > +                * the NUMA details for this HPA.
> > > +                */
> > > +               node = phys_to_target_node(start);
> > > +               if (node != NUMA_NO_NODE)
> > > +                       goto next;
> > > +
> > > +               node = acpi_map_pxm_to_node(pxm);
> > > +               if (node == NUMA_NO_NODE) {
> > > +                       pr_err("ACPI NUMA: Too many proximity domains.\n");
> > > +                       return -EINVAL;
> > > +               }
> > > +
> > > +               if (numa_add_memblk(node, start, end) < 0) {
> > > +                       /* CXL driver must handle the NUMA_NO_NODE case */
> > > +                       pr_warn("ACPI NUMA: Failed to add memblk for
> > > + CFMWS node
> > > %d [mem %#llx-%#llx]\n",
> > > +                               node, start, end);
> > > +               }
> > > +               pxm++;
> > > +next:
> > > +               cur += c->length;
> > > +       }
> > > +       return 0;
> > > +}
> > > +
> > >  static int __init acpi_parse_slit(struct acpi_table_header *table)  {
> > >         struct acpi_table_slit *slit = (struct acpi_table_slit
> > > *)table; @@ -478,6
> > > +539,15 @@ int __init acpi_numa_init(void)
> > >         /* SLIT: System Locality Information Table */
> > >         acpi_table_parse(ACPI_SIG_SLIT, acpi_parse_slit);
> > >
> > > +       /*
> > > +        * CEDT: CXL Fixed Memory Window Structures (CFMWS)
> > > +        * must be parsed after the SRAT. It creates NUMA
> > > +        * Nodes for CXL memory ranges not already defined
> > > +        * in the SRAT and it assigns PXMs after the max PXM
> > > +        * defined in the SRAT.
> > > +        */
> > > +       acpi_table_parse(ACPI_SIG_CEDT, acpi_cxl_cfmws_init);
> > > +
> > >         if (cnt < 0)
> > >                 return cnt;
> > >         else if (!parsed_numa_memblks) diff --git
> > > a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index
> > > 10120e4e0b9f..ccf73f0a59a7 100644
> > > --- a/drivers/cxl/acpi.c
> > > +++ b/drivers/cxl/acpi.c
> > > @@ -122,9 +122,11 @@ static void cxl_add_cfmws_decoders(struct
> > > device *dev,
> > >                                 cfmws->base_hpa, cfmws->base_hpa +
> > >                                 cfmws->window_size - 1);
> > >                 } else {
> > > -                       dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > > -                               dev_name(&cxld->dev), cfmws->base_hpa,
> > > -                                cfmws->base_hpa + cfmws->window_size - 1);
> > > +                       dev_dbg(dev, "add: %s node: %d range %#llx-%#llx\n",
> > > +                               dev_name(&cxld->dev),
> > > +                               phys_to_target_node(cxld->range.start),
> > > +                               cfmws->base_hpa,
> > > +                               cfmws->base_hpa + cfmws->window_size
> > > + - 1);
> > >                 }
> > >                 cur += c->length;
> > >         }
> > > diff --git a/include/linux/acpi.h b/include/linux/acpi.h index
> > > 974d497a897d..f837fd715440 100644
> > > --- a/include/linux/acpi.h
> > > +++ b/include/linux/acpi.h
> > > @@ -426,6 +426,7 @@ extern bool acpi_osi_is_win8(void);  #ifdef
> > > CONFIG_ACPI_NUMA  int acpi_map_pxm_to_node(int pxm);  int
> > > acpi_get_node(acpi_handle handle);
> > > +int __init numa_add_memblk(int nodeid, u64 start, u64 end);
> > >
> > >  /**
> > >   * pxm_to_online_node - Map proximity ID to online node
> > >
> > > base-commit: 64570fbc14f8d7cb3fe3995f20e26bc25ce4b2cc
> > > prerequisite-patch-id: f09c67c6b3801f511521fd29c1cc01f5c5b1e9de
> > > --
> > > 2.31.1
> >




[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux