Jonathan Cameron wrote:
> On Thu, 23 Jun 2022 21:19:44 -0700
> Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> > CXL regions (interleave sets) are made up of a set of memory devices
> > where each device maps a portion of the interleave with one of its
> > decoders (see CXL 2.0 8.2.5.12 CXL HDM Decoder Capability Structure).
> > As endpoint decoders are identified by a provisioning tool they can be
> > added to a region provided the region interleave properties are set
> > (way, granularity, HPA) and DPA has been assigned to the decoder.
> >
> > The attach event triggers several validation checks, for example:
> > - is the DPA sized appropriately for the region
> > - is the decoder reachable via the host-bridges identified by the
> >   region's root decoder
> > - is the device already active in a different region position slot
> > - are there already regions with a higher HPA active on a given port
> >   (per CXL 2.0 8.2.5.12.20 Committing Decoder Programming)
> >
> > ...and the attach event affords an opportunity to collect data and
> > resources relevant to later programming the target lists in switch
> > decoders, for example:
> > - allocate a decoder at each cxl_port in the decode chain
> > - for a given switch port, how many of the region's endpoints are
> >   hosted through the port
> > - how many unique targets (next hops) does a port need to map to reach
> >   those endpoints
> >
> > The act of reconciling this information and deploying it to the decoder
> > configuration is saved for a follow-on patch.
> Hi Dan,
>
> Only managed to grab a few mins today to debug that crash.. So I know
> the immediate cause but not yet why we got to that state.
>
> Test case (happened to be one I had open) is 2x HB, 2x RP on each,
> direct connected type 3s on all ports.
>
> Manual test script is:
>
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/core/cxl_core.ko
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/cxl_acpi.ko
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/cxl_port.ko
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/cxl_pci.ko
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/cxl_mem.ko
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/cxl_pmem.ko
>
> cd /sys/bus/cxl/devices/decoder0.0/
> cat create_pmem_region
> echo region0 > create_pmem_region
>
> cd region0/
> echo 4 > interleave_ways
> echo $((256 << 22)) > size
> echo 6a6b9b22-e0d4-11ec-9d64-0242ac120002 > uuid
> ls -lh /sys/bus/cxl/devices/endpoint?/upo*
>
> # Then figure out the order hopefully write the correct targets
> echo decoder5.0 > target0

Oh, something simple in the end. Just need to check that DPA is assigned
before region attach. I folded the following into patch 40:

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 0b5acabcc541..d52c97e941fe 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -765,10 +765,17 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 		return -ENXIO;
 	}
 
+	if (!cxled->dpa_res) {
+		dev_dbg(&cxlr->dev, "%s:%s: missing DPA allocation.\n",
+			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev));
+		return -ENXIO;
+	}
+
 	if (resource_size(cxled->dpa_res) * p->interleave_ways !=
 	    resource_size(p->res)) {
 		dev_dbg(&cxlr->dev,
-			"decoder-size-%#llx * ways-%d != region-size-%#llx\n",
+			"%s:%s: decoder-size-%#llx * ways-%d != region-size-%#llx\n",
+			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
 			(u64)resource_size(cxled->dpa_res), p->interleave_ways,
 			(u64)resource_size(p->res));
 		return -EINVAL;
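
The ordering in the diff above matters: the NULL check on the DPA
resource has to run before the size comparison, since that comparison
reads the resource's bounds. A minimal userspace sketch of that
ordering follows. It assumes a hypothetical struct dpa_range and a
hypothetical check_attach() helper standing in for the kernel's
struct resource and the relevant part of cxl_region_attach(); neither
name comes from the driver, and the real function does far more.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a resource range; not the kernel's struct resource. */
struct dpa_range {
	uint64_t start;
	uint64_t end;	/* inclusive end, matching the kernel's resource convention */
};

/* Size of an inclusive range, analogous to what resource_size() computes. */
static uint64_t range_size(const struct dpa_range *r)
{
	return r->end - r->start + 1;
}

/*
 * Hypothetical helper mirroring the two checks in the diff:
 *  1. no DPA assigned to the decoder       -> -ENXIO
 *  2. decoder size * ways != region size   -> -EINVAL
 * The NULL test comes first so range_size() never sees a NULL pointer.
 */
static int check_attach(const struct dpa_range *dpa_res,
			const struct dpa_range *region_res,
			unsigned int interleave_ways)
{
	if (!dpa_res)
		return -ENXIO;

	if (range_size(dpa_res) * interleave_ways != range_size(region_res))
		return -EINVAL;

	return 0;
}
```

With the region size from the test script (256 << 22 bytes, i.e. 1 GiB)
and 4 interleave ways, each decoder would need 256 MiB of DPA for the
size check to pass. Passing NULL for dpa_res models the unassigned-DPA
case and returns -ENXIO rather than dereferencing the pointer.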