> On Jan 21, 2022, at 4:41 PM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Fri, Jan 21, 2022 at 4:23 PM Song Liu <songliubraving@xxxxxx> wrote: >> >> >> >>> On Jan 21, 2022, at 3:55 PM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: >>> >>> On Fri, Jan 21, 2022 at 11:49 AM Song Liu <song@xxxxxxxxxx> wrote: >>>> >>>> +static struct bpf_binary_header * >>>> +__bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr, >>>> + unsigned int alignment, >>>> + bpf_jit_fill_hole_t bpf_fill_ill_insns, >>>> + u32 round_up_to) >>>> +{ >>>> + struct bpf_binary_header *hdr; >>>> + u32 size, hole, start; >>>> + >>>> + WARN_ON_ONCE(!is_power_of_2(alignment) || >>>> + alignment > BPF_IMAGE_ALIGNMENT); >>>> + >>>> + /* Most of BPF filters are really small, but if some of them >>>> + * fill a page, allow at least 128 extra bytes to insert a >>>> + * random section of illegal instructions. >>>> + */ >>>> + size = round_up(proglen + sizeof(*hdr) + 128, round_up_to); >>>> + >>>> + if (bpf_jit_charge_modmem(size)) >>>> + return NULL; >>>> + hdr = bpf_jit_alloc_exec(size); >>>> + if (!hdr) { >>>> + bpf_jit_uncharge_modmem(size); >>>> + return NULL; >>>> + } >>>> + >>>> + /* Fill space with illegal/arch-dep instructions. */ >>>> + bpf_fill_ill_insns(hdr, size); >>>> + >>>> + hdr->size = size; >>>> + hole = min_t(unsigned int, size - (proglen + sizeof(*hdr)), >>>> + PAGE_SIZE - sizeof(*hdr)); >>> >>> It probably should be 'round_up_to' instead of PAGE_SIZE ? >> >> Actually, some of these change is not longer needed after the following >> change in v6: >> >> 4. Change fall back round_up_to in bpf_jit_binary_alloc_pack() from >> BPF_PROG_MAX_PACK_PROG_SIZE to PAGE_SIZE. >> >> My initial thought (last year) was if we allocate more than 2MB (either >> 2.1MB or 3.9MB), we round up to 4MB to save page table entries. >> However, when I revisited this earlier today, I thought we should still >> round up to PAGE_SIZE to save memory >> >> Right now, I am not sure which way is better. What do you think? If we >> round up to PAGE_SIZE, we don't need split out __bpf_jit_binary_alloc(). > > The less code duplication the better. Got it. Will go with PAGE_SIZE. [...] >>>> + >>>> if (bpf_jit_charge_modmem(size)) >>>> return NULL; >>>> - hdr = bpf_jit_alloc_exec(size); >>>> + hdr = bpf_prog_pack_alloc(size); >>>> if (!hdr) { >>>> bpf_jit_uncharge_modmem(size); >>>> return NULL; >>>> @@ -888,9 +1052,8 @@ bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr, >>>> /* Fill space with illegal/arch-dep instructions. */ >>>> bpf_fill_ill_insns(hdr, size); >>>> >>>> - hdr->size = size; >>> >>> I'm missing where it's assigned. >>> Looks like hdr->size stays zero, so free is never performed? >> >> This is read only memory, so we set it in bpf_fill_ill_insns(). There was a >> comment in x86/bpf_jit_comp.c. I guess we also need a comment here. > > Ahh. Found it. Pls don't do it in fill_insn. > It's the wrong layering. > It feels that callbacks need to be redesigned. > I would operate on rw_header here and use > existing arch specific callback fill_insn to write into rw_image. > Both insns during JITing and 0xcc on both sides of the prog. > Populate rw_header->size (either before or after JITing) > and then do single text_poke_copy to write the whole thing > into the correct spot. In this way, we need to allocate rw_image here, and free it in bpf_jit_comp.c. This feels a little weird to me, but I guess that is still the cleanest solution for now. Thanks, Song