Re: [PATCH bpf-next v2 0/5] execmem_alloc for BPF programs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Thomas,

On Wed, Dec 7, 2022 at 7:36 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
[...]
> > Survey of the 11 architecture specific module_alloc(). They basically do
> > the following magic:
> >
> > 1. Modify MODULES_VADDR and/or MODULES_END. There are multiple
> >   reasons behind this, some arch does this for KASLR, some other archs
> >   have different MODULES_[VADDR|END] for different processors (32b vs.
> >   64b for example), some archs use some module address space for other
> >   things (i.e. _exiprom on arm).
> >
> > Archs need 1: x86, arm64, arm, mips, ppc, riscv, s390, loongarch,
> > sparc
>
> All of this is pretty much a boot time init decision, right?

Yeah, all of these are boot time or compile time decisions.

>
> > 2. Use kasan_alloc_module_shadow()
> >
> > Archs need 2: x86, arm64, s390
>
> There is nothing really architecture specific, so that can be part of
> the core code, right?

Right, kasan_free_module_shadow() is called from vmalloc.c, so the
alloc one can do the same.

>
> > 3. A secondary module address space. There is a smaller preferred
> >   address space for modules. Once the preferred space runs out, allocate
> >   memory from a secondary address space.
[...]
>
> > 6. nios2 uses kmalloc() for modules. Based on the comment, this is
> >   probably only because it needs different MODULES_[VADDR|END].
>
> It's a horrible hack because they decided to have their layout:
>
>      VMALLOC_SPACE   0x80000000
>      KERNEL_SPACE    0xC0000000
>
> and they use kmalloc because CALL26/PCREL26 cannot reach from 0x80000000
> to 0xC0000000. That's true, but broken beyond repair.
>
> Making the layout:
>
>      VMALLOC_SPACE   0x80000000
>      MODULE_SPACE    0xBE000000         == 0xC0000000 - (1 << 24) (32M)
> or
>      MODULE_SPACE    0xBF000000         == 0xC0000000 - (1 << 24) (16M)
>      KERNEL_SPACE    0xC0000000
>
> would have been too obvious...

Yeah, I was thinking about something like this.

>
> > I think we can handle all these with a single module_alloc() and a few
> > module_arch_* functions().

[...]

>
> /**
>  * struct mod_alloc_type - Parameters for module allocation type
>  * @mapto_type:         The type to merge this type into, if different
>  *                      from the actual type which is configured here.
>  * @flags:              Properties
>  * @granularity:        The allocation granularity (PTE/PMD)
>  * @alignment:          The allocation alignment requirement
>  * @start:              Array of address space range start (inclusive)
>  * @end:                Array of address space range end (inclusive)
>  * @pgprot:             The page protection for this type
>  * @fill:               Function to fill allocated space. If NULL, use memcpy()
>  * @invalidate:         Function to invalidate allocated space. If NULL, use memset()
>  *
>  * If @granularity > @alignment the allocation can reuse free space in
>  * previously allocated pages. If they are the same, then fresh pages
>  * have to be allocated.
>  */
> struct mod_alloc_type {
>         unsigned int    mapto_type;
>         unsigned int    flags;
>         unsigned int    granularity;
>         unsigned int    alignment;
>         unsigned long   start[MOD_MAX_ADDR_SPACES];
>         unsigned long   end[MOD_MAX_ADDR_SPACES];
>         pgprot_t        pgprot;
>         void            (*fill)(void *dst, void *src, unsigned int size);
>         void            (*invalidate)(void *dst, unsigned int size);
> };

Yeah, this is a lot better than arch_ functions.

We probably want two more function pointers here:

int (*protect)(unsigned long addr, int numpages);
int (*unprotect)(unsigned long addr, int numpages);

These two functions will be NULL for archs that support text_poke;
while legacy archs use them for set_memory_[ro|x|rw|nx]. Then, I
think we can get rid of VM_FLUSH_RESET_PERMS.

[...]

Everything else makes perfect sense. Thanks!

I think I am ready to dive into the code and prepare the first RFC/PATCH.
Please let me know if there is anything we should discuss/clarify before that.

Best,
Song



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux