> On Nov 18, 2021, at 10:28 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Nov 18, 2021 at 05:16:24PM +0000, Song Liu wrote:
>>
>>
>>> On Nov 17, 2021, at 11:54 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>>
>>> On Wed, Nov 17, 2021 at 11:57:12PM +0000, Song Liu wrote:
>>>
>>>> I would agree that __text_poke() is a safer option. But in this case, we
>>>> will need the temporary hole to be 2MB in size. Also, we will probably
>>>> hold the temporary mapping for longer time (the whole JITing process).
>>>> Does this sound reasonable?
>>>
>>> No :-)
>>>
>>> Jit to a buffer, then copy the buffer into the 2M page using 4k aliases.
>>> IIRC each program is still smaller than a single page, right? So at no
>>> point do you need more than 2 pages mapped anyway.
>>
>> JITing to a separate buffer adds complexity to the JIT process, as we
>> need to redo some offsets before the copy to match the final location of
>> the program. I don't have much experience with the JIT engine, so I am
>> not very sure how much work it gonna be.
>
> You're going to have to do that anyway if you're going to write to the
> directmap while executing from the alias.

Not really. If you look at the current version of 7/7, the logic is mostly
straightforward. We just make all the writes to the directmap, while
calculating offsets from the alias.

>
>> The BPF program could have up to 1000000 (BPF_COMPLEXITY_LIMIT_INSNS)
>> instructions (BPF instructions). So it could easily go beyond a few
>> pages. Mapping the 2MB page all together should make the logic simpler.
>
> Then copy it in smaller chunks I suppose.

How fast/slow is the __text_poke routine? I guess we cannot do it thousands
of times per BPF program (in chunks of a few bytes)?

Thanks,
Song
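
A minimal userspace sketch of the "copy it in smaller chunks" idea discussed
above, offered only as an illustration and not taken from the patch series.
It assumes a 4 KiB page size and a page-aligned destination region.
poke_chunk() and copy_in_page_chunks() are hypothetical names, and
poke_chunk() is a plain memcpy() standing in for whatever per-page write
primitive the kernel would use. The copy is split so that no single call
crosses a page boundary, so the number of calls scales with the pages
touched rather than with the number of bytes or instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 4096u /* assumed 4 KiB page size */

/* Hypothetical stand-in for a per-page poke primitive; here just memcpy(). */
static void poke_chunk(unsigned char *dst, const unsigned char *src, size_t len)
{
	memcpy(dst, src, len);
}

/*
 * Copy len bytes from src to region + off, splitting the copy so that no
 * single poke_chunk() call crosses a CHUNK_SIZE boundary in the destination.
 * Assumes region itself starts on a CHUNK_SIZE boundary.
 * Returns the number of poke_chunk() calls made.
 */
static size_t copy_in_page_chunks(unsigned char *region, size_t region_size,
				  size_t off, const unsigned char *src,
				  size_t len)
{
	size_t copied = 0, calls = 0;

	/* The whole destination range must fit inside the region. */
	assert(off <= region_size && len <= region_size - off);

	while (copied < len) {
		size_t pos = off + copied;
		/* Bytes left before the next chunk boundary. */
		size_t room = CHUNK_SIZE - (pos % CHUNK_SIZE);
		size_t left = len - copied;
		size_t n = left < room ? left : room;

		poke_chunk(region + pos, src + copied, n);
		copied += n;
		calls++;
	}
	return calls;
}
```

Under these assumptions, copying 5000 bytes to offset 100 takes two calls
(3996 bytes up to the first boundary, then the remaining 1004 bytes).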