Hi all,

We have recently been working on implementing CRIU [1] for QEMU, based on
Steven's work [2]. It uses memfd to allocate guest memory so that the memory
can be restored (inherited) in the new QEMU process. However, a read fault on
memfd allocates a new page, whereas a read fault on anonymous memory maps the
zero page. For QEMU, using memfd may therefore cause all guest memory to be
allocated during migration, because QEMU reads every page while migrating.
This can lead to OOM when memory is over-committed, which is usually the case
in public clouds.

In this patch I try to add support for mapping the zero page when reading
memfd. On a read, memfd maps the zero page instead of allocating a new page,
and the page is then COWed when a write occurs.

For now this is just a demo for discussion. There is still a lot of work to
do, e.g.:
1. THP is not supported;
2. Shared reading and writing is not supported; it only works for the inherit
   case. For example:
     task1                        | task2
       1) read from addr          |
                                  |   2) write to addr
       3) read from addr again    |
   then 3) will read 0 instead of the data task2 wrote in 2).

Would something similar be welcome in Linux?

Thanks,
Peng

[1] https://criu.org/Checkpoint/Restore
[2] https://patchwork.kernel.org/project/qemu-devel/cover/1628286241-217457-1-git-send-email-steven.sistare@xxxxxxxxxx/

Peng Liang (1):
  memfd: Support mapping to zero page on reading memfd

 include/linux/fs.h         |  2 ++
 include/uapi/linux/memfd.h |  1 +
 mm/memfd.c                 |  8 ++++++--
 mm/memory.c                | 37 ++++++++++++++++++++++++++++++++++---
 mm/shmem.c                 | 10 ++++++++--
 5 files changed, 51 insertions(+), 7 deletions(-)

--
2.33.1
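
For reference, the read-fault behaviour described in the cover letter can be
observed from userspace on an unpatched kernel. The following is only a
minimal sketch, not part of the patch: the function names and the
"guest-ram-demo" label are made up for illustration, and it assumes that
st_blocks reported by fstat() reflects the pages currently allocated to the
memfd (as it does for tmpfs/shmem files). It reads one byte from every page of
a fresh memfd mapping and reports how many bytes the kernel allocated as a
result.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: read one byte per page so that every page takes a
 * read fault. Returns the sum of the bytes read (0 for untouched memory). */
static unsigned long touch_pages_readonly(const volatile unsigned char *p,
                                          size_t npages, size_t page_size)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < npages; i++)
        sum += p[i * page_size];
    return sum;
}

/* Hypothetical helper: create a memfd of npages pages, only read from it,
 * and return the number of bytes allocated to it afterwards (st_blocks is
 * counted in 512-byte units). Returns -1 on any error. */
long memfd_bytes_after_read(size_t npages)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    size_t len = npages * (size_t)page_size;

    int fd = memfd_create("guest-ram-demo", 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Read-only access: no byte of the mapping is ever written. */
    unsigned long sum = touch_pages_readonly(p, npages, (size_t)page_size);

    struct stat st;
    long bytes = -1;
    if (sum == 0 && fstat(fd, &st) == 0)
        bytes = (long)st.st_blocks * 512;

    munmap(p, len);
    close(fd);
    return bytes;
}
```

Calling memfd_bytes_after_read(16) on an unpatched kernel is expected to
report at least 16 pages' worth of allocated memory even though nothing was
written. An anonymous mapping read the same way would be backed by the shared
zero page instead, which is the difference the patch tries to remove.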