This is a note to let you know that I've just added the patch titled

    cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails

to the 6.5-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     cxl-region-do-not-try-to-cleanup-after-cxl_region_setup_targets-fails.patch
and it can be found in the queue-6.5 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


From 0718588c7aaa7a1510b4de972370535b61dddd0d Mon Sep 17 00:00:00 2001
From: Jim Harris <jim.harris@xxxxxxxxxxx>
Date: Wed, 11 Oct 2023 14:51:31 +0000
Subject: cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails

From: Jim Harris <jim.harris@xxxxxxxxxxx>

commit 0718588c7aaa7a1510b4de972370535b61dddd0d upstream.

Commit 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in
cxl_region_attach()") tried to avoid 'eiw' initialization errors when
->nr_targets exceeded 16, by just decrementing ->nr_targets when
cxl_region_setup_targets() failed.

Commit 86987c766276 ("cxl/region: Cleanup target list on attach error")
extended that cleanup to also clear cxled->pos and p->targets[pos]. The
initialization error was incidentally fixed separately by:
Commit 8d4285425714 ("cxl/region: Fix port setup uninitialized variable
warnings") which was merged a few days after 5e42bcbc3fef.

But now the original cleanup when cxl_region_setup_targets() fails
prevents endpoint and switch decoder resources from being reused:

1) the cleanup does not set the decoder's region to NULL, which results
   in future dpa_size_store() calls returning -EBUSY
2) the decoder is not properly freed, which results in future commit
   errors associated with the upstream switch

Now that the initialization errors were fixed separately, the proper
cleanup for this case is to just return immediately.
Then the resources associated with this target get cleaned up as normal
when the failed region is deleted.

The ->nr_targets decrement in the error case also helped prevent
a p->targets[] array overflow, so add a new check to guard against
that overflow.

Tested by trying to create an invalid region for a 2 switch * 2 endpoint
topology, and then following up with creating a valid region.

Fixes: 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()")
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Jim Harris <jim.harris@xxxxxxxxxxx>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
Acked-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
Reviewed-by: Dave Jiang <dave.jiang@xxxxxxxxx>
Link: https://lore.kernel.org/r/169703589120.1202031.14696100866518083806.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 drivers/cxl/core/region.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1676,6 +1676,12 @@ static int cxl_region_attach(struct cxl_
 		return -ENXIO;
 	}
 
+	if (p->nr_targets >= p->interleave_ways) {
+		dev_dbg(&cxlr->dev, "region already has %d endpoints\n",
+			p->nr_targets);
+		return -EINVAL;
+	}
+
 	ep_port = cxled_to_port(cxled);
 	root_port = cxlrd_to_port(cxlrd);
 	dport = cxl_find_dport_by_dev(root_port, ep_port->host_bridge);
@@ -1768,7 +1774,7 @@ static int cxl_region_attach(struct cxl_
 	if (p->nr_targets == p->interleave_ways) {
 		rc = cxl_region_setup_targets(cxlr);
 		if (rc)
-			goto err_decrement;
+			return rc;
 		p->state = CXL_CONFIG_ACTIVE;
 	}
 
@@ -1800,12 +1806,6 @@ static int cxl_region_attach(struct cxl_
 	}
 
 	return 0;
-
-err_decrement:
-	p->nr_targets--;
-	cxled->pos = -1;
-	p->targets[pos] = NULL;
-	return rc;
 }
 
 static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)


Patches currently in stable-queue which might be from
jim.harris@xxxxxxxxxxx are

queue-6.5/cxl-region-do-not-try-to-cleanup-after-cxl_region_setup_targets-fails.patch
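
For readers who want to see why the new bounds check matters once the
rollback is gone, here is a minimal userspace sketch of the attach logic
described in the changelog. It is not the kernel code: struct model_region,
model_setup_targets(), model_region_attach(), MODEL_MAX_TARGETS and the
setup_should_fail knob are all hypothetical names invented for this
illustration, and the error codes for the position checks are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_TARGETS 16	/* assumed array size, for this model only */

/* Hypothetical, simplified stand-in for the region parameters. */
struct model_region {
	int interleave_ways;			/* targets the region expects */
	int nr_targets;				/* targets attached so far */
	const void *targets[MODEL_MAX_TARGETS];
	int setup_should_fail;			/* test knob: force setup to fail */
	int active;				/* set once setup succeeds */
};

/* Stand-in for the setup step that runs once the target list is full. */
static int model_setup_targets(const struct model_region *r)
{
	return r->setup_should_fail ? -ENXIO : 0;
}

static int model_region_attach(struct model_region *r, const void *decoder,
			       int pos)
{
	assert(r->interleave_ways <= MODEL_MAX_TARGETS);

	/*
	 * The check the patch adds: with no decrement on the error path,
	 * a full target list must be rejected before anything is stored.
	 */
	if (r->nr_targets >= r->interleave_ways)
		return -EINVAL;

	if (pos < 0 || pos >= r->interleave_ways)
		return -ENXIO;
	if (r->targets[pos])
		return -EBUSY;

	r->targets[pos] = decoder;
	r->nr_targets++;

	if (r->nr_targets == r->interleave_ways) {
		int rc = model_setup_targets(r);

		/* Return immediately; state is left for region deletion. */
		if (rc)
			return rc;
		r->active = 1;
	}

	return 0;
}
```

In this model, a failed setup leaves nr_targets at interleave_ways, so any
later attach attempt is rejected with -EINVAL instead of writing past the
end of targets[]. The leftover state is expected to be released when the
failed region is deleted, as the changelog describes.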