Re: [RFC v2 4/4] vmalloc_exec: share a huge page with kernel text

> On Oct 11, 2022, at 1:40 PM, Edgecombe, Rick P <rick.p.edgecombe@xxxxxxxxx> wrote:
> 
> On Tue, 2022-10-11 at 16:25 +0000, Song Liu wrote:
>>> Maybe this is just me missing some vmalloc understanding, but this
>>> pointer to an all zero vm_struct seems weird too. Are there other
>>> vmap allocations like this? Which vmap APIs work with this and
>>> which don't?
>> 
>> There are two vmap trees at the moment: free_area_ tree and 
>> vmap_area_ tree. free_area_ tree uses vmap->subtree_max_size, while 
>> vmap_area_ tree contains vmap backed by vm_struct, and thus uses 
>> vmap->vm. 
>> 
>> This set adds a new tree, free_text_area_. This tree is different from
>> the other two, as it uses subtree_max_size, and it is also backed 
>> by vm_struct. To handle this requirement without growing vmap_struct,
>> we introduced all_text_vm to store the vm_struct for free_text_area_
>> tree. 
>> 
>> free_text_area_ tree is different to vmap_area_ tree. Each vmap in
>> vmap_area_ tree has its own vm_struct (1 to 1 mapping), while 
>> multiple vmap in free_text_area_ tree map to a single vm_struct.
>> 
>> Also, free_text_area_ handles granularity < PAGE_SIZE; while the
>> other two trees only work with PAGE_SIZE aligned memory. 
>> 
>> Does this answer your questions? 
> 
> I mean from the perspective of someone trying to use this without
> diving into the entire implementation.
> 
> The function is called vmalloc_exec() and is freed with vfree_exec().
> Makes sense. But with the other vmallocs_foo's (including previous
> vmalloc_exec() implementations) you can call find_vm_area(), etc on
> them. They show in "vmallocinfo" and generally behave similarly. That
> isn't true for these new allocations, right?

That's right. These operations are not supported (at least for now). 
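
To make the tree description quoted above a bit more concrete, here is a
rough userspace sketch of the idea. It is not the patch code: the
*_model types and the busy_area_vm()/text_area_vm() helpers are made-up
names for illustration, and only the subtree_max_size/vm union and the
single shared all_text_vm come from the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins only; these are NOT the kernel definitions. */
struct vm_struct_model {
	void *addr;
	unsigned long size;
};

struct vmap_area_model {
	unsigned long va_start;
	unsigned long va_end;
	/* One slot, two meanings, depending on which tree holds the node. */
	union {
		unsigned long subtree_max_size;	/* free_area_ and free_text_area_ */
		struct vm_struct_model *vm;	/* vmap_area_ only */
	};
};

/* Hypothetical stand-in for all_text_vm: shared by all text-tree nodes. */
static struct vm_struct_model all_text_vm_model;

/* vmap_area_ tree: each node carries its own vm_struct (1 to 1). */
static struct vm_struct_model *busy_area_vm(const struct vmap_area_model *va)
{
	return va->vm;
}

/*
 * free_text_area_ tree: the union slot is taken by subtree_max_size,
 * so the backing vm_struct comes from the shared object (many to 1).
 */
static struct vm_struct_model *text_area_vm(const struct vmap_area_model *va)
{
	assert(va->va_end > va->va_start);
	return &all_text_vm_model;
}
```

In this model, two nodes in the text tree resolve to the same backing
object, while a node in the vmap_area_ tree resolves to its own. That is
also why lookups that expect va->vm to be valid cannot be used on the
text tree nodes.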

> 
> Then you have code that operates on module text like:
> if (is_vmalloc_or_module_addr(addr))
> 	pfn = vmalloc_to_pfn(addr);
> 
> It looks like it would work (on x86 at least). Should it be expected
> to?
> 
> Especially after this patch, where there is memory that isn't even
> tracked by the original vmap_area trees, it is pretty much a separate
> allocator. So I think it might be nice to spell out which other vmalloc
> APIs work with these new functions since they are named "vmalloc".
> Maybe just say none of them do.

I guess it is fair to call this a separate allocator. Maybe 
vmalloc_exec is not the right name? I do think this is the best 
way to build an allocator with vmap tree logic. 

> 
> 
> Separate from that, I guess you are planning to make this limited to
> certain architectures? It might be better to put logic with assumptions
> about x86 boot time page table details inside arch/x86 somewhere.

Yes, the architecture needs some text_poke mechanism to use this. 
On the BPF side, x86_64 calls this directly from arch code (the JIT engine), 
so it is mostly covered. For modules, we need to handle this better. 

Thanks,
Song



