On Thu, 14 Jul 2022 17:00:59 -0700
Dan Williams <dan.j.williams@xxxxxxxxx> wrote:

> Recall that CXL capable address ranges, on ACPI platforms, are published
> in the CEDT.CFMWS (CXL Early Discovery Table: CXL Fixed Memory Window
> Structures). These windows represent both the actively mapped capacity
> and the potential address space that can be dynamically assigned to a
> new CXL decode configuration (region / interleave-set).
>
> CXL endpoints like DDR DIMMs can be mapped at any physical address
> including 0 and legacy ranges.
>
> There is an expectation and requirement that the /proc/iomem interface
> and the iomem_resource tree in the kernel reflect the full set of
> platform address ranges. I.e. that every address range that platform
> firmware and bus drivers enumerate be reflected as an iomem_resource
> entry. The hard requirement to do this for CXL arises from the fact that
> facilities like CONFIG_DEVICE_PRIVATE expect to be able to treat empty
> iomem_resource ranges as free for software to use as proxy address
> space. Without CXL publishing its potential address ranges in
> iomem_resource, the CONFIG_DEVICE_PRIVATE mechanism may inadvertently
> steal capacity reserved for runtime provisioning of new CXL regions.
>
> So, iomem_resource needs to know about both active and potential CXL
> resource ranges. The active CXL resources might already be reflected in
> iomem_resource as "System RAM". insert_resource_expand_to_fit() handles
> re-parenting "System RAM" underneath a CXL window.
>
> The "_expand_to_fit()" behavior handles cases where a CXL window is not
> a strict superset of an existing entry in the iomem_resource tree. The
> "_expand_to_fit()" behavior is acceptable from the perspective of
> resource allocation. The expansion happens because a conflicting
> resource range is already populated, which means the resource boundary
> expansion does not result in any additional free CXL address space being
> made available.
> CXL address space allocation is always bounded by the
> original unexpanded address range.
>
> However, the potential for expansion does mean that something like
> walk_iomem_res_desc(IORES_DESC_CXL...) can only return fuzzy answers on
> corner case platforms that cause the resource tree to expand a CXL
> window resource over a range that is not decoded by CXL. This would be
> an odd platform configuration, but if it becomes a problem in practice
> the CXL subsystem could just publish an API that returns definitive
> answers.
>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Cc: Tony Luck <tony.luck@xxxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>
> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>

I wish this was a bit simpler (particularly the complex relationship
between the places res is added vs the single cleanup location), but
having stared at it for a while I can't figure out a way that works...
*defeated* :)
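
For anyone following the "_expand_to_fit()" argument in the changelog, here
is a rough userspace toy model of it. This is not the kernel implementation:
struct range, expand_to_fit() and free_bytes() are hypothetical names made up
for illustration, the model uses a flat array instead of the resource tree,
and it assumes the busy ranges do not overlap each other. It only tries to
show why growing a window over an already populated range adds no free
address space.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Nonzero if a and b share at least one address. */
static int overlaps(const struct range *a, const struct range *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/* Nonzero if outer covers all of inner. */
static int contains(const struct range *outer, const struct range *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/*
 * Grow *win until no busy range straddles one of its boundaries.
 * Ranges that are disjoint, fully inside the window, or fully
 * enclosing it are left alone (the real tree would nest those).
 */
static void expand_to_fit(struct range *win, const struct range *busy, int n)
{
	int changed;

	do {
		changed = 0;
		for (int i = 0; i < n; i++) {
			if (!overlaps(win, &busy[i]) ||
			    contains(win, &busy[i]) ||
			    contains(&busy[i], win))
				continue;
			if (busy[i].start < win->start)
				win->start = busy[i].start;
			if (busy[i].end > win->end)
				win->end = busy[i].end;
			changed = 1;
		}
	} while (changed);
}

/* Bytes of *win that are not covered by any busy range. */
static uint64_t free_bytes(const struct range *win,
			   const struct range *busy, int n)
{
	uint64_t avail = win->end - win->start + 1;

	for (int i = 0; i < n; i++) {
		uint64_t lo, hi;

		if (!overlaps(win, &busy[i]))
			continue;
		lo = busy[i].start > win->start ? busy[i].start : win->start;
		hi = busy[i].end < win->end ? busy[i].end : win->end;
		avail -= hi - lo + 1;
	}
	return avail;
}
```

With a made-up window of 0x100000000-0x1ffffffff and a made-up "System RAM"
range of 0xf0000000-0x17fffffff straddling its start, expand_to_fit() moves
the window start down to 0xf0000000, and free_bytes() reports 0x80000000 for
both the original and the expanded window. The extra span is entirely
occupied, which matches the changelog's point that allocation stays bounded
by the original unexpanded range.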