Re: [PATCH RFC 04/12] x86: add support of memory protection for NUMA replicas

Hi Shivank,
Thanks a lot for the comments and findings. I've fixed the build and plan to update the patch set soon.

On 1/9/2024 9:46 AM, Garg, Shivank wrote:
> Hi Artem,
>
> I hope this message finds you well.
> I've encountered a compilation issue when KERNEL_REPLICATION is disabled in the config.
>
> ld: vmlinux.o: in function `alloc_insn_page':
> /home/amd/linux_mainline/arch/x86/kernel/kprobes/core.c:425: undefined reference to `numa_set_memory_rox'
> ld: vmlinux.o: in function `alloc_new_pack':
> /home/amd/linux_mainline/kernel/bpf/core.c:873: undefined reference to `numa_set_memory_rox'
> ld: vmlinux.o: in function `bpf_prog_pack_alloc':
> /home/amd/linux_mainline/kernel/bpf/core.c:891: undefined reference to `numa_set_memory_rox'
> ld: vmlinux.o: in function `bpf_trampoline_update':
> /home/amd/linux_mainline/kernel/bpf/trampoline.c:447: undefined reference to `numa_set_memory_rox'
> ld: vmlinux.o: in function `bpf_struct_ops_map_update_elem':
> /home/amd/linux_mainline/kernel/bpf/bpf_struct_ops.c:515: undefined reference to `numa_set_memory_rox'
> ld: vmlinux.o:/home/amd/linux_mainline/kernel/bpf/bpf_struct_ops.c:524: more undefined references to `numa_set_memory_rox' follow
>
>
> After some investigation, I've put together a patch that resolves this compilation issue for me.
>
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2268,6 +2268,15 @@ int numa_set_memory_nonglobal(unsigned long addr, int numpages)
>
>         return ret;
>  }
> +
> +#else
> +
> +int numa_set_memory_rox(unsigned long addr, int numpages)
> +{
> +       return set_memory_rox(addr, numpages);
> +
> +}
> +
>  #endif
>
> Additionally, I'm interested in evaluating the performance impact of this patchset on AMD processors.
> Could you please point me to the benchmarks that you used in the cover letter?
>
> Best Regards,
> Shivank
>
Regarding the benchmarks: for now we used a self-implemented test that generates system call load.
We used the RedHawk Linux approach as a reference.

It is described in the article "An Overview of Kernel Text Page Replication in RedHawk™ Linux® 6.3":
https://concurrent-rt.com/wp-content/uploads/2020/12/kernel-page-replication.pdf

The test is very simple.
All measured system calls were invoked through the syscall() wrapper from glibc:

#include <sys/syscall.h>      /* Definition of SYS_* constants */
#include <unistd.h>
 
long syscall(long number, ...);
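A minimal sketch of how one call made through this wrapper could be timed from user space. This is not the actual test code, which is not shown in this thread. The helper names now_ns() and time_syscall_ns() are hypothetical, and CLOCK_MONOTONIC is an assumed choice of clock:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Read the monotonic clock in nanoseconds (hypothetical helper). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Time `iters` invocations of a system call that takes no arguments,
 * issued through the generic glibc syscall() wrapper (hypothetical helper).
 * The result includes user/kernel transition and loop overhead.
 */
static uint64_t time_syscall_ns(long number, unsigned int iters)
{
    uint64_t start = now_ns();

    for (unsigned int i = 0; i < iters; i++)
        syscall(number);

    return now_ns() - start;
}
```

For example, time_syscall_ns(SYS_getpid, 1024) returns the elapsed nanoseconds for 1024 getpid calls; SYS_getpid is only a stand-in here and is not one of the measured calls.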

fork/1
    The system call was invoked once. The time between entering and exiting
    the system call was measured.
fork/1024
    The system call was invoked in a loop 1024 times. The time between entering and exiting the loop was measured.
mmap/munmap
    A set of 1024 pages (PAGE_SIZE defaults to 4096 if it is not defined) was mapped with the mmap syscall
    and unmapped with munmap. Each page was mapped and unmapped in its own loop iteration.
mmap/lock
    The same as above, but with the MAP_LOCKED flag added.
open/close
    The /dev/null pseudo-file was opened and closed in a loop 1024 times, once per iteration.
mount
    The procfs pseudo-filesystem was mounted once on a temporary directory inside /tmp.
    The time between entering and exiting the system call was measured.
kill
    A signal handler for SIGUSR1 was set up. The signal was sent to a child process created with the glibc fork wrapper.
    The time between sending and receiving SIGUSR1 was measured.
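The loop-based cases above can be sketched roughly as follows. This is an illustration under stated assumptions, not the benchmark that produced the cover-letter numbers: the function names are hypothetical, the dedicated libc wrappers (open, close, mmap, munmap) are used instead of syscall() for portability, and one page per iteration is assumed for the mmap case.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define ITERATIONS 1024

/* Nanoseconds elapsed between two CLOCK_MONOTONIC readings. */
static uint64_t elapsed_ns(const struct timespec *a, const struct timespec *b)
{
    int64_t ns = (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
                 + (b->tv_nsec - a->tv_nsec);
    return (uint64_t)ns;
}

/* open/close: open and close /dev/null once per iteration. */
static int bench_open_close(uint64_t *ns_out)
{
    struct timespec start, end;

    assert(ns_out != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < ITERATIONS; i++) {
        int fd = open("/dev/null", O_RDONLY);
        if (fd < 0)
            return -1;
        close(fd);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    *ns_out = elapsed_ns(&start, &end);
    return 0;
}

/*
 * mmap/munmap: map and unmap one anonymous page per iteration.
 * Pass MAP_LOCKED in extra_flags for the mmap/lock variant
 * (may fail if RLIMIT_MEMLOCK is too low).
 */
static int bench_mmap_munmap(int extra_flags, uint64_t *ns_out)
{
    struct timespec start, end;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    assert(ns_out != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < ITERATIONS; i++) {
        void *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0);
        if (p == MAP_FAILED)
            return -1;
        munmap(p, page);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    *ns_out = elapsed_ns(&start, &end);
    return 0;
}
```

Calling bench_mmap_munmap(0, &ns) corresponds to the mmap/munmap case and bench_mmap_munmap(MAP_LOCKED, &ns) to mmap/lock. To compare replicated and non-replicated kernels, the same binary would be run on both and the elapsed times compared.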

Testing environment:
    Processor: Intel(R) Xeon(R) CPU E5-2690
    2 NUMA nodes with 12 CPU cores each.

Best Regards,
Artem




