v4:
- Reworked PCIe link path latency calculation
- 0-day fixes
- Removed unused qos_list from cxl_memdev and its stray usages

v3:
- Please see specific patches for log entries addressing comments from v2.
- Refactor cxl_port_probe() additions. (Alison)
- Convert to use 'struct node_hmem_attrs'
- Refactor to use common code for genport target allocation.
- Add third array entry for target hmem_attrs to store genport locality data.
- Go back to per partition QTG ID. (Dan)

v2:
- Please see specific patches for log entries addressing comments from v1.
- Removed ACPICA code usages.
- Removed PCI subsystem helpers for latency and bandwidth.
- Add CXL switch CDAT parsing support (SSLBIS)
- Add generic port SRAT+HMAT support (ACPI)
- Export a single QTG ID via sysfs per memory device (Dan)
- Provide rest of DSMAS range info in debugfs (Dan)

Hi Rafael,
please review the relevant patches to ACPI: 13/23-16/23. Thank you! If they
are ok, Dan can take them through the CXL tree for upstream merging.

13 - Adds enum for memory_target hmem_attrs in order to enumerate the array
     index.
14 - Add generic port target allocation for SRAT parsing during HMAT init in
     order to extract and store the device handle.
15 - Add a new index for hmem_attrs and save the locality data to the new
     hmem_attrs array element with generic port data. The old array elements
     are preserved for later when we want to store the calculated CXL memory
     target locality data.
16 - Add ACPI helper function to retrieve the locality data for generic port.
     Used by CXL driver to calculate the full locality data for the CXL
     memory device.

This series adds the retrieval of QoS Throttling Group (QTG) IDs for the CXL
Fixed Memory Window Structure (CFMWS) and the CXL memory device. It provides
the QTG IDs to user space to provide some guidance with putting the proper DPA
range under the appropriate CFMWS window for a hot-plugged CXL memory device.

The CFMWS structure contains a QTG ID that is associated with the memory
window that the structure exports. On Linux, the CFMWS is represented as a CXL
root decoder. The QTG ID will be attached to the CXL root decoder and exported
as a sysfs attribute (qtg_id).

The QTG ID for a device is retrieved by evaluating a _DSM method on the
ACPI0017 device. The _DSM expects an input package of 4 DWORDs that contains
the read latency, write latency, read bandwidth, and write bandwidth. These
are the calculated numbers for the path between the CXL device and the CXL
host bridge (HB). The QTG ID is also exported as a sysfs attribute under the
mem device memory partition type:

  /sys/bus/cxl/devices/memX/ram/qtg_id
  /sys/bus/cxl/devices/memX/pmem/qtg_id

Only the first QTG ID is exported. The rest of the information can be found
under /sys/kernel/debug/cxl/memX/qtgmap, where all the DPA ranges with the
correlated QTG ID are displayed. Each DSMAS from the device CDAT will provide
a DPA range.

The latency numbers are the aggregated latencies for the path between the CXL
device and the CPU. If a CXL device is directly attached to the CXL HB, the
latency would be the aggregated latencies from the device Coherent Device
Attribute Table (CDAT), the calculated PCIe link latency between the device
and the HB, and the generic port data from ACPI SRAT+HMAT. The bandwidth in
this configuration would be the minimum of the CDAT bandwidth number, the link
bandwidth between the device and the HB, and the bandwidth data from the
generic port data via ACPI SRAT+HMAT.

If a configuration has a switch in between, then the latency would be the
aggregated latencies from the device CDAT, the link latency between device and
switch, the latency from the switch CDAT, the link latency between switch and
the HB, and the generic port latency between the CPU and the CXL HB.
The bandwidth calculation would be the min of the device CDAT bandwidth, the
link bandwidth between device and switch, the switch CDAT bandwidth, the link
bandwidth between switch and HB, and the generic port bandwidth.

There can be 0 or more switches between the CXL device and the CXL HB. There
are detailed examples on calculating bandwidth and latency in the CXL Memory
Device Software Guide [4].

The CDAT provides Device Scoped Memory Affinity Structures (DSMAS) that
contain the Device Physical Address (DPA) range and the related Device Scoped
Latency and Bandwidth Information Structures (DSLBIS). Each DSLBIS provides a
latency or bandwidth entry that is tied to a DSMAS entry via a per DSMAS
unique DSMAD handle.

This series is based on Lukas's latest DOE changes [5]. Kernel branch with all
the code can be retrieved here [6] for convenience.

Test setup is done with the run_qemu genport support branch [7]. The setup
provides 2 CXL HBs with one HB having a CXL switch underneath. It also
provides generic port support detailed below.

A hacked up qemu branch is used to support generic port SRAT and HMAT [8].

To create the appropriate HMAT entries for generic port, the following qemu
parameters must be added:

  -object genport,id=$X -numa node,genport=genport$X,nodeid=$Y,initiator=$Z
  -numa hmat-lb,initiator=$Z,target=$X,hierarchy=memory,data-type=access-latency,latency=$latency
  -numa hmat-lb,initiator=$Z,target=$X,hierarchy=memory,data-type=access-bandwidth,bandwidth=$bandwidthM
  for ((i = 0; i < total_nodes; i++)); do
          for ((j = 0; j < cxl_hbs; j++ )); do    # 2 CXL HBs
                  -numa dist,src=$i,dst=$X,val=$dist
          done
  done

See the genport run_qemu branch for full details.

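As an illustration of the path calculation described earlier in this letter
(latencies add up along the path, bandwidth is the minimum along the path),
here is a minimal userspace sketch. It is not the code in this series:
struct path_perf, min_u32() and combine_path() are hypothetical names made up
for this example, and the units of the values are left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-hop performance numbers (not a kernel type). A hop is
 * one contributor on the path: device CDAT, a PCIe link, a switch CDAT,
 * or the generic port data.
 */
struct path_perf {
	uint32_t read_lat;
	uint32_t write_lat;
	uint32_t read_bw;
	uint32_t write_bw;
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Fold all hops into one set of numbers for the whole path:
 * latencies are summed, bandwidths are reduced with min().
 * With n == 0 the bandwidth fields stay at UINT32_MAX (no limit seen).
 * Overflow of the latency sums is not handled in this sketch.
 */
static struct path_perf combine_path(const struct path_perf *hops, size_t n)
{
	struct path_perf total = { 0, 0, UINT32_MAX, UINT32_MAX };
	size_t i;

	for (i = 0; i < n; i++) {
		total.read_lat += hops[i].read_lat;
		total.write_lat += hops[i].write_lat;
		total.read_bw = min_u32(total.read_bw, hops[i].read_bw);
		total.write_bw = min_u32(total.write_bw, hops[i].write_bw);
	}

	return total;
}
```

For a device behind one switch, the hops array would hold five entries in
this model: device CDAT, device-to-switch link, switch CDAT, switch-to-HB
link, and the generic port. The four resulting values correspond to the four
DWORDs of the _DSM input package.
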
[1]: https://www.computeexpresslink.org/download-the-specification
[2]: https://uefi.org/sites/default/files/resources/Coherent%20Device%20Attribute%20Table_1.01.pdf
[3]: https://uefi.org/sites/default/files/resources/ACPI_Spec_6_5_Aug29.pdf
[4]: https://cdrdv2-public.intel.com/643805/643805_CXL%20Memory%20Device%20SW%20Guide_Rev1p0.pdf
[5]: https://lore.kernel.org/linux-cxl/20230313195530.GA1532686@bhelgaas/T/#t
[6]: https://git.kernel.org/pub/scm/linux/kernel/git/djiang/linux.git/log/?h=cxl-qtg
[7]: https://github.com/pmem/run_qemu/tree/djiang/genport
[8]: https://github.com/davejiang/qemu/tree/genport

---

Dave Jiang (23):
  cxl: Export QTG ids from CFMWS to sysfs
  cxl: Add checksum verification to CDAT from CXL
  cxl: Add support for reading CXL switch CDAT table
  cxl: Add common helpers for cdat parsing
  cxl: Add callback to parse the DSMAS subtables from CDAT
  cxl: Add callback to parse the DSLBIS subtable from CDAT
  cxl: Add callback to parse the SSLBIS subtable from CDAT
  cxl: Add support for _DSM Function for retrieving QTG ID
  cxl: Add helper function to retrieve ACPI handle of CXL root device
  cxl: Add helpers to calculate pci latency for the CXL device
  cxl: Add helper function that calculates QoS values for switches
  cxl: Add helper function that calculate QoS values for PCI path
  ACPI: NUMA: Create enum for memory_target hmem_attrs indexing
  ACPI: NUMA: Add genport target allocation to the HMAT parsing
  ACPI: NUMA: Add setting of generic port locality attributes
  ACPI: NUMA: Add helper function to retrieve the performance attributes
  cxl: Add helper function to retrieve generic port QoS
  cxl: Add latency and bandwidth calculations for the CXL path
  cxl: Wait Memory_Info_Valid before access memory related info
  cxl: Move identify and partition query from pci probe to port probe
  cxl: Store QTG IDs and related info to the CXL memory device context
  cxl: Export sysfs attributes for memory device QTG ID
  cxl/mem: Add debugfs output for QTG related data

 Documentation/ABI/testing/debugfs-cxl   |  11 +
 Documentation/ABI/testing/sysfs-bus-cxl |  31 +++
 drivers/acpi/numa/hmat.c                | 138 ++++++++++--
 drivers/cxl/acpi.c                      |   3 +
 drivers/cxl/core/Makefile               |   2 +
 drivers/cxl/core/acpi.c                 | 180 ++++++++++++++++
 drivers/cxl/core/cdat.c                 | 270 ++++++++++++++++++++++++
 drivers/cxl/core/mbox.c                 |   3 +
 drivers/cxl/core/memdev.c               |  26 +++
 drivers/cxl/core/pci.c                  | 187 ++++++++++++++--
 drivers/cxl/core/port.c                 | 183 ++++++++++++++++
 drivers/cxl/cxl.h                       |  27 +++
 drivers/cxl/cxlmem.h                    |  21 ++
 drivers/cxl/cxlpci.h                    | 117 ++++++++++
 drivers/cxl/mem.c                       |  17 ++
 drivers/cxl/pci.c                       |  21 --
 drivers/cxl/port.c                      | 155 +++++++++++++-
 include/acpi/actbl3.h                   |   2 +
 include/linux/acpi.h                    |   6 +
 19 files changed, 1348 insertions(+), 52 deletions(-)
 create mode 100644 Documentation/ABI/testing/debugfs-cxl
 create mode 100644 drivers/cxl/core/acpi.c
 create mode 100644 drivers/cxl/core/cdat.c

--