Hi all,

Overview:
While testing CXL memory hot-remove, we noticed that `daxctl offline-memory dax0.0` sometimes gets stuck forever. `daxctl offline-memory dax0.0` writes "offline" to /sys/devices/system/memory/memoryNNN/state.

Workaround:
When it happens, we can press Ctrl-C to abort the command and then retry. The CXL memory then goes offline successfully.

Where the kernel gets stuck:
After digging into the kernel, we found that when the issue occurs, the kernel is stuck in the outer loop of offline_pages(). Below is a simplified excerpt of offline_pages() with the relevant parts highlighted:
```
int __ref offline_pages()
{
	do { // outer loop
		pfn = start_pfn;
		do {
			ret = scan_movable_pages(pfn, end_pfn, &pfn); // returns -ENOENT
			if (!ret)
				do_migrate_range(pfn, end_pfn); // not reached
		} while (!ret);

		ret = test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE);
	} while (ret); // ret is -EBUSY
}
```

In this case, we dumped the first page that cannot be isolated (see the dump_page output below). Its content does not change between iterations:
```
Jun 28 15:29:26 linux kernel: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7980dd
Jun 28 15:29:26 linux kernel: flags: 0x9fffffc0000000(node=2|zone=3|lastcpupid=0x1fffff)
Jun 28 15:29:26 linux kernel: raw: 009fffffc0000000 ffffdfbd9e603788 ffffd4f0ffd97ef0 0000000000000000
Jun 28 15:29:26 linux kernel: raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
Jun 28 15:29:26 linux kernel: page dumped because: trouble page...
```

Every time the issue occurs, the content of the page structure is similar.

Questions:
Q1. Is this behavior expected? From an administrator's point of view, the command should return promptly (with success or failure) instead of hanging indefinitely.

Q2. Regarding offline_pages(): encountering such a page does cause an endless loop. Shouldn't another part of the kernel have changed the state of this page in a timely manner?
When I use the workaround mentioned above (Ctrl-C, then try the offline again), I find that the page state has changed (see the dump_page output below):
```
Jun 28 15:33:12 linux kernel: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7980dd
Jun 28 15:33:12 linux kernel: flags: 0x9fffffc0000000(node=2|zone=3|lastcpupid=0x1fffff)
Jun 28 15:33:12 linux kernel: raw: 009fffffc0000000 dead000000000100 dead000000000122 0000000000000000
Jun 28 15:33:12 linux kernel: raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
Jun 28 15:33:12 linux kernel: page dumped because: previous trouble page
```

What our test does:
We have a CXL memory device that is configured as kmem and onlined into the MOVABLE zone as NUMA node 2. We run two processes, consume-memory and offline-memory, in parallel. See the pseudo code below:
```
main() {
	if (fork() == 0)
		numactl -m 2 ./consume-memory    // child: allocate from node 2
	else {
		daxctl offline-memory dax0.0     // parent: offline the same memory
		wait()
	}
}
```

Attached is the process information at the time it gets stuck:
```
root  25716  0.0  0.0  2460  1408 pts/0  S+  15:28  0:00 ./main
root  25719  0.0  0.0     0     0 pts/0  Z+  15:28  0:00 [consume-memory] <defunct>
root  25720 98.6  0.0  9476  3740 pts/0  R+  15:28  0:26 daxctl offline-memory /dev/dax0.0
```

Feel free to let me know if you need more details. Thank you for your attention to this issue; I look forward to your insights.

Thanks
Zhijian
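
P.S. The manual workaround (interrupt the stuck offline, then retry) could be automated roughly as follows. This is a minimal sketch in C, not something our test runs: `run_with_deadline()` and `retry_with_deadline()` are hypothetical helper names, the 100 ms polling interval is arbitrary, and the sketch assumes that a SIGINT sent with kill() aborts the offline the same way Ctrl-C does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] and give it timeout_sec seconds.
 * Returns the child's exit status if it exited on its own,
 * -2 if it was terminated by a signal after the deadline expired and
 * we sent SIGINT (the signal Ctrl-C delivers), and -1 on any other failure.
 */
int run_with_deadline(char *const argv[], int timeout_sec)
{
	assert(argv && argv[0]);

	pid_t pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Make sure SIGINT is not inherited as "ignored". */
		signal(SIGINT, SIG_DFL);
		execvp(argv[0], argv);
		_exit(127);	/* exec failed */
	}

	const struct timespec tick = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };
	time_t deadline = time(NULL) + timeout_sec;
	int interrupted = 0;

	for (;;) {
		int status;
		pid_t r = waitpid(pid, &status, WNOHANG);

		if (r == pid) {
			if (WIFEXITED(status))
				return WEXITSTATUS(status);
			return interrupted ? -2 : -1;
		}
		if (r < 0 && errno != EINTR)
			return -1;
		if (!interrupted && time(NULL) >= deadline) {
			kill(pid, SIGINT);	/* same effect as typing Ctrl-C */
			interrupted = 1;
		}
		nanosleep(&tick, NULL);	/* poll every 100 ms */
	}
}

/*
 * Hypothetical helper: retry up to max_tries times.
 * Returns 0 on the first success, otherwise the result of the last attempt.
 */
int retry_with_deadline(char *const argv[], int timeout_sec, int max_tries)
{
	int ret = -1;

	for (int i = 0; i < max_tries; i++) {
		ret = run_with_deadline(argv, timeout_sec);
		if (ret == 0)
			break;
	}
	return ret;
}
```

For example, calling `retry_with_deadline()` with the argv `{"daxctl", "offline-memory", "dax0.0", NULL}`, a 30-second deadline, and 3 tries would mimic the manual Ctrl-C-and-retry sequence. Those two numbers are illustrative values only. Note that the sketch waits indefinitely if the child ignores SIGINT.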