Hi all,

Zi Yan and I would like to propose the topic: Enhancements to Page Migration with Multi-threading and Batch Offloading to DMA.

Page migration is a critical operation in NUMA systems that can incur significant overhead, affecting memory-management performance across various workloads. For example, copying folios between DRAM NUMA nodes can take ~25% of the total migration cost when migrating 256MB of data. Modern systems are equipped with powerful DMA engines for bulk data copying, GPUs, and high CPU core counts. Leveraging these hardware capabilities becomes essential for systems where frequent page promotion and demotion occur - from large-scale tiered-memory systems with CXL nodes to CPU-GPU coherent systems with GPU memory exposed as NUMA nodes.

Existing page migration copies pages sequentially, underutilizing modern CPU architectures and high-bandwidth memory subsystems. We have proposed and posted RFCs to enhance page migration through three key techniques:

1. Batching migration operations for bulk copying of data [1]
2. Multi-threaded folio copying [2]
3. DMA offloading to hardware accelerators [1]

By employing batching and multi-threaded folio copying, we are able to achieve significant improvements in page-migration throughput for large pages.

Discussion points:

1. Performance:
   a. Policy decisions for selecting between DMA and CPU copying
   b. Platform-specific scheduling of folio-copy worker threads for better bandwidth utilization
   c. Using non-temporal instructions for CPU-based memcpy
   d. Upscaling/downscaling worker threads based on migration size, CPU availability (system load), bandwidth saturation, etc.
2. Interface requirements with DMA hardware:
   a. Standardizing APIs for DMA drivers and support for different DMA drivers
   b. Enhancing DMA drivers for bulk copying (e.g., the SDXi engine)
3. Resource accounting:
   a. CPU cgroup accounting and fairness [3]
   b. Who bears the migration cost? (Migration cost attribution)
References:

[1] https://lore.kernel.org/all/20240614221525.19170-1-shivankg@xxxxxxx
[2] https://lore.kernel.org/all/20250103172419.4148674-1-ziy@xxxxxxxxxx
[3] https://lore.kernel.org/all/CAHbLzkpoKP0fVZP5b10wdzAMDLWysDy7oH0qaUssiUXj80R6bw@xxxxxxxxxxxxxx

Best Regards,
Shivank