Hi Shivank,

thanks a lot for the comments and findings. I've fixed the build and plan
to update the patch set soon.

On 1/9/2024 9:46 AM, Garg, Shivank wrote:
> Hi Artem,
>
> I hope this message finds you well.
> I've encountered a compilation issue when KERNEL_REPLICATION is disabled in the config.
>
> ld: vmlinux.o: in function `alloc_insn_page':
> /home/amd/linux_mainline/arch/x86/kernel/kprobes/core.c:425: undefined reference to `numa_set_memory_rox'
> ld: vmlinux.o: in function `alloc_new_pack':
> /home/amd/linux_mainline/kernel/bpf/core.c:873: undefined reference to `numa_set_memory_rox'
> ld: vmlinux.o: in function `bpf_prog_pack_alloc':
> /home/amd/linux_mainline/kernel/bpf/core.c:891: undefined reference to `numa_set_memory_rox'
> ld: vmlinux.o: in function `bpf_trampoline_update':
> /home/amd/linux_mainline/kernel/bpf/trampoline.c:447: undefined reference to `numa_set_memory_rox'
> ld: vmlinux.o: in function `bpf_struct_ops_map_update_elem':
> /home/amd/linux_mainline/kernel/bpf/bpf_struct_ops.c:515: undefined reference to `numa_set_memory_rox'
> ld: vmlinux.o:/home/amd/linux_mainline/kernel/bpf/bpf_struct_ops.c:524: more undefined references to `numa_set_memory_rox' follow
>
>
> After some investigation, I've put together a patch that resolves this compilation issues for me.
>
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2268,6 +2268,15 @@ int numa_set_memory_nonglobal(unsigned long addr, int numpages)
>
>         return ret;
>  }
> +
> +#else
> +
> +int numa_set_memory_rox(unsigned long addr, int numpages)
> +{
> +       return set_memory_rox(addr, numpages);
> +
> +}
> +
>  #endif
>
> Additionally, I'm interested in evaluating the performance impact of this patchset on AMD processors.
> Could you please point me the benchmarks that you have used in cover letter?
>
> Best Regards,
> Shivank
>

Regarding the benchmarks, for now we used a self-implemented test that
generates a system call load. We used the RedHawk Linux approach as a
reference.
The "An Overview of Kernel Text Page Replication in RedHawk™ Linux® 6.3"
article was used:
https://concurrent-rt.com/wp-content/uploads/2020/12/kernel-page-replication.pdf

The test is very simple. All measured system calls were invoked through
the syscall wrapper from glibc, e.g.

    #include <sys/syscall.h>      /* Definition of SYS_* constants */
    #include <unistd.h>

    long syscall(long number, ...);

fork/1
    The measurement covers a single invocation of this system call.
    Time is measured between entering and exiting the system call.

fork/1024
    The system call is invoked in a loop 1024 times. The time between
    entering the loop and exiting it was measured.

mmap/munmap
    A set of 1024 pages (the page size is taken as 4096 if PAGE_SIZE is
    not defined) was mapped using the mmap syscall and unmapped using
    munmap. One page is mapped and unmapped per loop iteration.

mmap/lock
    The same as above, but with the MAP_LOCKED flag added.

open/close
    The /dev/null pseudo-file was opened and closed in a loop 1024 times,
    once per iteration.

mount
    The procfs pseudo-filesystem was mounted once to a temporary directory
    inside /tmp. The time between entering and exiting the system call
    was measured.

kill
    A signal handler for SIGUSR1 was set up. The signal was sent to a
    child process created with glibc's fork wrapper. The time between
    sending and receiving the SIGUSR1 signal was measured.

Testing environment:
    Processor: Intel(R) Xeon(R) CPU E5-2690, 2 nodes with 12 CPU cores each.

Best Regards,
Artem
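
For illustration, the open/close case described above could be sketched
roughly as follows. This is a minimal sketch, not the test code that was
actually used: the timer (clock_gettime with CLOCK_MONOTONIC), the use of
SYS_openat instead of SYS_open, and the function names elapsed_ns and
time_open_close_ns are all assumptions made for the example.

```c
#define _GNU_SOURCE             /* exposes syscall() and AT_FDCWD in glibc */
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>        /* Definition of SYS_* constants */
#include <time.h>
#include <unistd.h>

/* Nanoseconds elapsed between two clock samples (hypothetical helper). */
static int64_t elapsed_ns(const struct timespec *start,
                          const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
           + (end->tv_nsec - start->tv_nsec);
}

/*
 * Open and close /dev/null `iterations` times through the glibc syscall()
 * wrapper, once each per iteration. Returns the time spent in the whole
 * loop in nanoseconds, or -1 if any call fails.
 *
 * Assumption: CLOCK_MONOTONIC is used as the time source; the original
 * description does not say which timer was used.
 */
int64_t time_open_close_ns(int iterations)
{
    struct timespec start, end;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;

    for (int i = 0; i < iterations; i++) {
        /* SYS_openat is chosen because SYS_open is not defined on every
         * architecture; this is an assumption, not the original choice. */
        long fd = syscall(SYS_openat, AT_FDCWD, "/dev/null", O_RDONLY);
        if (fd < 0)
            return -1;
        if (syscall(SYS_close, (int)fd) != 0)
            return -1;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    return elapsed_ns(&start, &end);
}
```

Calling time_open_close_ns(1024) from a small main() and printing the
result would correspond to the open/close case; the other cases would
follow the same pattern with a different system call inside the loop.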