On Thu, April 18, 2024 at 3:55 AM, Uladzislau Rezki wrote:
> On Tue, Apr 02, 2024 at 03:15:01PM -0500, Maxwell Bland wrote:
> > +extern void insert_vmap_area_augment(struct vmap_area *va, struct rb_node
> > +extern int va_clip(struct rb_root *root, struct list_head *head,
> > +extern struct vmap_area *__find_vmap_area(unsigned long addr,
>
> To me it looks like you want to make internal functions as public for
> everyone which is not good, imho.

First, thank you for the feedback. I tussled with some of these ideas
too while writing. I will clarify some motivations below and then
propose some alternatives based upon your review.

> arch_skip_va() injections into the search algorithm sounds like a hack and
> might lead(if i do not miss something, need to check closer) to alloc
> failures when we go toward a reserved VA but we are not allowed to allocate
> from.

This is a good insight into the architectural intention here. As is
clear, the underlying goal of this patch is to provide a method for
architectures to enforce their own pseudo-reserved vmalloc regions
dynamically. Given that, would the potential failures you highlight not
technically be legitimate, with the caveat that architectures
implementing the interface become responsible for maintaining only
correct and appropriate reservations?

If so, the path forward depends on whether we believe that caveat is
reasonable. I am on the fence about whether giving architectures this
freedom is good, so I think it is reasonable to disallow it; see below.

> Why do not you allocate just using a specific range from MODULES_ASLR_START
> till VMALLOC_END?

That was my initial approach, but it was deemed unfit: Mark Rutland has
indicated that he does not support a large reduction in free region
size in exchange for ensuring pages are not interleaved. Strict
partitioning creates a trade-off between region size and ASLR
randomization.
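
A rough user-space toy may help make the allocation-failure concern
concrete. This is only a sketch, not kernel code and not the patch:
toy_range, toy_skip_va() and toy_alloc() are made-up names, and
toy_skip_va() merely stands in for the idea of an arch_skip_va()-style
hook consulted during a first-fit search.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: a half-open address range [start, end). */
struct toy_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Stand-in for an arch_skip_va()-style hook (an assumption, not the
 * real interface): true if [addr, addr + size) overlaps the reserved
 * range and must therefore be skipped by the search.
 */
static bool toy_skip_va(const struct toy_range *reserved,
			unsigned long addr, unsigned long size)
{
	return addr < reserved->end && addr + size > reserved->start;
}

/*
 * First-fit search over a single free window in steps of 'align'.
 * Returns the chosen address, or 0 on failure (so the toy assumes the
 * window never starts at address 0). Passing reserved == NULL disables
 * the hook.
 */
static unsigned long toy_alloc(const struct toy_range *free_win,
			       const struct toy_range *reserved,
			       unsigned long size, unsigned long align)
{
	unsigned long addr;

	assert(align != 0); /* a zero step would never advance */

	for (addr = free_win->start; addr + size <= free_win->end;
	     addr += align) {
		if (reserved && toy_skip_va(reserved, addr, size))
			continue; /* free, but the hook forbids it */
		return addr;
	}

	return 0; /* every candidate was out of range or skipped */
}
```

In this toy, if the reserved range covers the whole free window, the
search fails even though the space is free, which is the failure mode
described in the review. Whether that counts as legitimate depends on
who owns the reservation.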

To clarify a secondary point, in case this question was more general:
allowing interleaving between the VMALLOC_START..VMALLOC_END and
MODULES_ASLR_START..MODULES_ASLR_END regions breaks a key use case,
namely being able to enforce new PMD-level and coarse-grained
protections (e.g. PXNTable) dynamically.

In case the question is more of a "why are you submitting this in the
first place": non-interleaving simplifies code focused on preventing
malicious page table updates, since we do not need to track every
update of PTE-level descriptors. Verifying individual PTE updates comes
at a high cost in both performance and complexity, and happens to lead
to hardware-level privilege-checking race conditions on certain very
popular arm64 chipsets.

OK, preamble out of the way:

(1) Would it be OK to export a more generic version of the functions
written in arch/arm64/kernel/vmalloc.c for
https://lore.kernel.org/all/20240416122254.868007168-3-mbland@xxxxxxxxxxxx/
That is, move a version of these functions to the main vmalloc.c? This
way these functions are still owned by the right part of the kernel.

Or (2) the exported functions could effectively be duplicated into
architecture-specific code, a sort of "all in" on the caveat mentioned
above: architectures that choose to implement the interface become
responsible for maintaining a reserved code region.

(3) I also have in mind a different approach that does not involve
skipping the allocation of "bad" VAs but instead dynamically
restructures the tree, potentially just creating two trees, one for
data and one for code.

Thanks and Regards,
Maxwell Bland