On 20/02/2024 at 21:32, Maxwell Bland wrote:
> Reworks ARM's virtual memory allocation infrastructure to support
> dynamic enforcement of page middle directory PXNTable restrictions
> rather than only during the initial memory mapping. Runtime enforcement
> of this bit prevents write-then-execute attacks, where malicious code is
> staged in vmalloc'd data regions, and later the page table is changed to
> make this code executable.
>
> Previously the entire region from VMALLOC_START to VMALLOC_END was
> vulnerable, but now the vulnerable region is restricted to the 2GB
> reserved by module_alloc, a region which is generally read-only and more
> difficult to inject staging code into, e.g., data must pass the BPF
> verifier. These changes also set the stage for other systems, such as
> KVM-level (EL2) changes to mark page tables immutable and code page
> verification changes, forging a path toward complete mitigation of
> kernel exploits on ARM.
>
> Implementing this required minimal changes to the generic vmalloc
> interface in the kernel to allow architecture overrides of some vmalloc
> wrapper functions, refactoring vmalloc calls to use a standard interface
> in the generic kernel, and passing the address parameter already passed
> into PTE allocation to the pte_allocate child function call.
>
> The new arm64 vmalloc wrapper functions ensure vmalloc data is not
> allocated into the region reserved for module_alloc. arm64 BPF and
> kprobe code also see a two-line-change ensuring their allocations abide
> by the segmentation of code from data. Finally, arm64's pmd_populate
> function is modified to set the PXNTable bit appropriately.

On powerpc (book3s/32) we have more or less the same, although it is not
directly linked to PMDs: the virtual 4G address space is split into
segments of 256M.
Each segment has a bit called NX to forbid execution. Vmalloc space is
allocated in a segment with the NX bit set, while module space is
allocated in a segment with the NX bit unset.

We never have to override vmalloc wrappers. All consumers of exec memory
allocate it using module_alloc(), while vmalloc() provides non-exec
memory.

For modules, all you have to do is select
ARCH_WANTS_MODULES_DATA_IN_VMALLOC and module data will be allocated
using vmalloc(), hence in non-exec memory in our case.

Christophe
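
For illustration only, the segment arithmetic described above can be sketched in plain C. This is a toy model, not kernel code: the names segment_index(), mark_segment_nx() and addr_is_executable() are hypothetical, and the per-segment flag array merely stands in for the hardware's per-segment NX bit. The only facts taken from the description are the 4G address space, the 256M segment size, and one NX bit per segment.

```c
#include <stdbool.h>
#include <stdint.h>

/* 256M segments: 256M == 1 << 28, so a 4G space holds 16 of them. */
#define SEGMENT_SHIFT 28u
#define NUM_SEGMENTS  16u

/* Hypothetical stand-in for the per-segment NX bit (true = no execute). */
static bool segment_nx[NUM_SEGMENTS];

/* The top 4 bits of a 32-bit virtual address select the segment. */
static unsigned int segment_index(uint32_t addr)
{
    return addr >> SEGMENT_SHIFT;
}

/* Set or clear the NX flag of the whole segment containing addr. */
static void mark_segment_nx(uint32_t addr, bool nx)
{
    segment_nx[segment_index(addr)] = nx;
}

/* An address is executable only if its segment's NX flag is clear. */
static bool addr_is_executable(uint32_t addr)
{
    return !segment_nx[segment_index(addr)];
}
```

With this model, marking the segment that holds vmalloc space as NX makes every address in that 256M window non-executable at once, while a module segment left unmarked stays executable. Which segments actually hold vmalloc and module space depends on the kernel configuration and is not modelled here.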