> On Nov 17, 2021, at 3:57 PM, Song Liu <songliubraving@xxxxxx> wrote:
>
>> On Nov 17, 2021, at 2:01 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>
>> On Wed, Nov 17, 2021 at 09:36:27PM +0000, Song Liu wrote:
>>>
>>>> On Nov 16, 2021, at 12:00 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>>>
>>>> On Mon, Nov 15, 2021 at 11:13:42PM -0800, Song Liu wrote:
>>>>> These allow setting ro/x for module_alloc() mapping, while leaving the
>>>>> linear mapping rw/nx.
>>>>
>>>> This needs a very strong rationale for *why*. How does this not
>>>> trivially circumvent W^X?
>>>
>>> In this case, we want to have multiple BPF programs sharing the 2MB page.
>>> When the JIT engine is working on one program, we would rather existing
>>> BPF programs on the same page stay on the RO+X mapping (the module_alloc()
>>> address). The solution in this version is to let the JIT engine write to
>>> the page via the linear address.
>>>
>>> An alternative is to only use the module_alloc() address, and flip the
>>> read-only bit (of the whole 2MB page) back and forth. However, this
>>> requires some serialization among different JIT jobs.
>>
>> Neither option seems acceptable to me as they both violate W^X.
>>
>> Please have a close look at arch/x86/kernel/alternative.c:__text_poke()
>> for how we modify active text. I think that or something very similar is
>> the only option. By having an alias in a special (user) address space
>> that is not accessible by any other CPU, only the poking CPU can exploit
>> this (temporary) hole, which is a much larger ask than any of the
>> proposed options.
>
> I would agree that __text_poke() is a safer option. But in this case, we
> will need the temporary hole to be 2MB in size. Also, we will probably
> hold the temporary mapping for a longer time (the whole JITing process).
> Does this sound reasonable?

Actually, the hole is probably not always 2MB in size. But it could be up
to 2MB in size.

Song
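
For readers following the thread, the aliasing idea under discussion (writing
through one mapping while a second mapping of the same memory stays read-only)
can be sketched in userspace. This is only an analogy, not kernel code: it
uses memfd_create() and mmap() instead of module_alloc() and the linear
mapping, it does not model the per-CPU temporary address space that
__text_poke() relies on, and the names alias_demo and REGION_SIZE are
hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: 2MB region, matching the upper bound mentioned in the thread. */
#define REGION_SIZE (2UL * 1024 * 1024)

/*
 * Hypothetical helper: map the same anonymous memory twice, once
 * read-write and once read-only, write through the RW alias, and copy
 * what the RO alias sees into 'out'. Returns 0 on success, -1 on error.
 */
static int alias_demo(char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);

    int fd = memfd_create("alias_demo", 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, REGION_SIZE) < 0) {
        close(fd);
        return -1;
    }

    /* Writable alias: stands in for the mapping the JIT writes through. */
    char *rw = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, 0);
    /* Read-only alias: stands in for the mapping existing users see. */
    char *ro = mmap(NULL, REGION_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mappings keep the memory alive */

    if (rw == MAP_FAILED || ro == MAP_FAILED) {
        if (rw != MAP_FAILED)
            munmap(rw, REGION_SIZE);
        if (ro != MAP_FAILED)
            munmap(ro, REGION_SIZE);
        return -1;
    }

    /* The write goes through the RW alias only... */
    strcpy(rw, "jit output");
    /* ...but is visible through the RO alias, which was never writable. */
    snprintf(out, out_len, "%s", ro);

    munmap(rw, REGION_SIZE);
    munmap(ro, REGION_SIZE);
    return 0;
}
```

Calling alias_demo(buf, sizeof buf) with a 32-byte buffer should leave the
written string in buf. The point of the objection in the thread is visible
here as well: while the writable alias exists, anything that can reach it can
modify memory that the other mapping presents as read-only.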