Re: [PATCH bpf-next 2/7] set_memory: introduce set_memory_[ro|x]_noalias

Song Liu <songliubraving@xxxxxx> · Thu, 18 Nov 2021 18:39:49 +0000

> On Nov 18, 2021, at 10:28 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> 
> On Thu, Nov 18, 2021 at 05:16:24PM +0000, Song Liu wrote:
>> 
>> 
>>> On Nov 17, 2021, at 11:54 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>> 
>>> On Wed, Nov 17, 2021 at 11:57:12PM +0000, Song Liu wrote:
>>> 
>>>> I would agree that __text_poke() is a safer option. But in this case, we 
>>>> will need the temporary hole to be 2MB in size. Also, we will probably 
>>>> hold the temporary mapping for a longer time (the whole JITing process). 
>>>> Does this sound reasonable?
>>> 
>>> No :-)
>>> 
>>> Jit to a buffer, then copy the buffer into the 2M page using 4k aliases.
>>> IIRC each program is still smaller than a single page, right? So at no
>>> point do you need more than 2 pages mapped anyway.
>> 
>> JITing to a separate buffer adds complexity to the JIT process, as we 
>> need to redo some offsets before the copy to match the final location of 
>> the program. I don't have much experience with the JIT engine, so I am
>> not very sure how much work it is going to be. 
> 
> You're going to have to do that anyway if you're going to write to the
> directmap while executing from the alias.

Not really. If you look at the current version of 7/7, the logic is mostly 
straightforward. We just make all the writes to the directmap, while 
calculating offsets from the alias. 
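
As a rough illustration of that pattern, here is a userspace sketch (not 
code from the patch). emit_jmp_rel32(), rw_base and ro_base are made-up 
names: rw_base stands in for the writable mapping and ro_base for the 
address the code will eventually run from. 

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not from the patch: emit an x86-64 "jmp rel32"
 * at byte offset off. The bytes are stored through rw_base (the
 * writable mapping), but the displacement is computed against ro_base
 * (the address the code will execute from).
 */
static void emit_jmp_rel32(uint8_t *rw_base, uint64_t ro_base,
			   size_t off, uint64_t target)
{
	/* rel32 is relative to the end of the 5-byte instruction */
	int64_t disp = (int64_t)(target - (ro_base + off + 5));
	int32_t rel;

	assert(disp >= INT32_MIN && disp <= INT32_MAX);
	rel = (int32_t)disp;

	rw_base[off] = 0xE9;	/* jmp rel32 opcode */
	memcpy(rw_base + off + 1, &rel, sizeof(rel));
}
```

The point is only that the store address and the address used for the 
displacement do not have to be the same mapping. 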

> 
>> The BPF program could have up to 1000000 (BPF_COMPLEXITY_LIMIT_INSNS)
>> instructions (BPF instructions). So it could easily go beyond a few 
>> pages. Mapping the 2MB page all together should make the logic simpler. 
> 
> Then copy it in smaller chunks I suppose.

How fast/slow is the __text_poke routine? I guess we cannot do it thousands
of times per BPF program (in chunks of a few bytes)? 
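
To make the question concrete, this is roughly how I read the "copy in 
chunks" idea, as a userspace sketch. poke_chunk() is a hypothetical 
stand-in for __text_poke() (here just memcpy plus a counter), and 
SKETCH_PAGE_SIZE and copy_in_page_chunks() are made-up names. 

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed 4k alias size */

static size_t nr_pokes;

/* Hypothetical stand-in for __text_poke(): a memcpy plus a call counter. */
static void poke_chunk(uint8_t *dst, const uint8_t *src, size_t len)
{
	memcpy(dst, src, len);
	nr_pokes++;
}

/*
 * Copy len bytes from a separate JIT buffer into dst at dst_off,
 * splitting the copy so that no single chunk crosses a 4k boundary
 * of the destination offset.
 */
static void copy_in_page_chunks(uint8_t *dst, size_t dst_off,
				const uint8_t *src, size_t len)
{
	while (len) {
		size_t room = SKETCH_PAGE_SIZE - (dst_off % SKETCH_PAGE_SIZE);
		size_t n = len < room ? len : room;

		poke_chunk(dst + dst_off, src, n);
		dst_off += n;
		src += n;
		len -= n;
	}
}
```

With page-sized chunks the number of calls is about one per 4k of 
program text plus one, instead of one per emitted instruction. 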

Thanks,
Song