On 16.08.24 07:07, Dan Williams wrote:
[ add David ]
Andrew Morton wrote:
On Fri, 16 Aug 2024 10:07:23 +0800 Huang Ying <ying.huang@xxxxxxxxx> wrote:
On a system with CXL memory installed, the resource tree (/proc/iomem)
related to CXL memory looks something like the following.
490000000-50fffffff : CXL Window 0
  490000000-50fffffff : region0
    490000000-50fffffff : dax0.0
      490000000-50fffffff : System RAM (kmem)
When the following command is run to try to write to the CXL memory
range,
$ dd if=data of=/dev/mem bs=1k seek=19136512 count=1
dd: error writing '/dev/mem': Bad address
1+0 records in
0+0 records out
0 bytes copied, 0.0283507 s, 0.0 kB/s
the command fails as expected. However, the error code is wrong: it
should be "Operation not permitted" instead of "Bad address". In
addition, the following warning is reported in the kernel log.
ioremap on RAM at 0x0000000490000000 - 0x0000000490000fff
WARNING: CPU: 2 PID: 416 at arch/x86/mm/ioremap.c:216 __ioremap_caller.constprop.0+0x131/0x35d
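For reference, the dd arguments above can be tied to the address in the
warning with a little arithmetic. A minimal sketch in C; the helper names
are made up for illustration, and a 4 KiB page size is assumed:

```c
#include <stdint.h>

/*
 * Hypothetical helper: byte offset at which dd starts writing.
 * dd's seek= counts output blocks of bs bytes each.
 */
static uint64_t dd_seek_offset(uint64_t bs, uint64_t seek)
{
	return bs * seek;
}

/*
 * Hypothetical helper: last byte of the page containing addr,
 * assuming 4 KiB pages.
 */
static uint64_t page_last_byte(uint64_t addr)
{
	return addr | UINT64_C(0xfff);
}
```

With bs=1k (1024 bytes) and seek=19136512, dd_seek_offset() gives
0x490000000, the start of CXL Window 0, and page_last_byte() of that is
0x490000fff, which matches the range printed in the ioremap warning.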
But we should definitely fix the warning.
...
Presumably we want to fix earlier kernels? If so, are you able to
identify a suitable Fixes: target? Possibly 974854ab0728 ("cxl/acpi:
Track CXL resources in iomem_resource")?
At least that commit, but I think this problem potentially goes back
farther to:
c221c0b0308f device-dax: "Hotplug" persistent memory for use like normal RAM
...because that started the era of "System RAM" as a non-top-level
resource.
David did a bunch of work to fix this back in:
97f61c8f44ec kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
...but the fallout in region_intersects() was missed.
Sounds reasonable.
For virtio-mem we set IORESOURCE_SYSTEM_RAM|IORESOURCE_EXCLUSIVE on our
(highest) parent resource (to make any /dev/mem access attempts of that
memory fail). So the problem is likely specific to other
add_memory_driver_managed() users.
I have a faint recollection that at some point we had code that would
set IORESOURCE_SYSTEM_RAM on parent resources up the tree, but either my
memory is wrong or that code was ripped out long ago.
Fix idea is reasonable: check if anything in that range is
IORESOURCE_SYSTEM_RAM.
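
To illustrate the idea, here is a simplified userspace model of the
resource tree quoted above. It is only a sketch: struct model_res,
MODEL_SYSTEM_RAM and both helpers are hypothetical names for this example,
not the kernel's struct resource or region_intersects(), and locking and
other resource flags are ignored. It contrasts a check that looks only at
the top-level resources with one that descends into children:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model-only flag; not the kernel's IORESOURCE_SYSTEM_RAM value. */
#define MODEL_SYSTEM_RAM 0x1u

struct model_res {
	uint64_t start, end;            /* inclusive range */
	unsigned int flags;
	const struct model_res *child;   /* first child, or NULL */
	const struct model_res *sibling; /* next sibling, or NULL */
};

/* The tree from the report: System RAM is three levels below the window. */
static const struct model_res ex_ram = {
	0x490000000ULL, 0x50fffffffULL, MODEL_SYSTEM_RAM, NULL, NULL
};
static const struct model_res ex_dax = {
	0x490000000ULL, 0x50fffffffULL, 0, &ex_ram, NULL
};
static const struct model_res ex_region = {
	0x490000000ULL, 0x50fffffffULL, 0, &ex_dax, NULL
};
static const struct model_res ex_window = {
	0x490000000ULL, 0x50fffffffULL, 0, &ex_region, NULL
};

static bool overlaps(const struct model_res *r, uint64_t start, uint64_t end)
{
	return r->start <= end && r->end >= start;
}

/* Looks only at the resources on the given sibling list. */
static bool top_level_has_ram(const struct model_res *list,
			      uint64_t start, uint64_t end)
{
	for (const struct model_res *r = list; r; r = r->sibling)
		if (overlaps(r, start, end) && (r->flags & MODEL_SYSTEM_RAM))
			return true;
	return false;
}

/* Also descends into children of every overlapping resource. */
static bool any_level_has_ram(const struct model_res *list,
			      uint64_t start, uint64_t end)
{
	for (const struct model_res *r = list; r; r = r->sibling) {
		if (!overlaps(r, start, end))
			continue;
		if (r->flags & MODEL_SYSTEM_RAM)
			return true;
		if (any_level_has_ram(r->child, start, end))
			return true;
	}
	return false;
}
```

For the page at 0x490000000 - 0x490000fff, top_level_has_ram(&ex_window, ...)
returns false because "CXL Window 0" itself carries no System RAM flag,
while any_level_has_ram(&ex_window, ...) returns true because it reaches
the "System RAM (kmem)" descendant.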
--
Cheers,
David / dhildenb