Re: [PATCH bpf-next 2/7] set_memory: introduce set_memory_[ro|x]_noalias

> On Nov 17, 2021, at 3:57 PM, Song Liu <songliubraving@xxxxxx> wrote:
> 
> 
> 
>> On Nov 17, 2021, at 2:01 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> 
>> On Wed, Nov 17, 2021 at 09:36:27PM +0000, Song Liu wrote:
>>> 
>>> 
>>>> On Nov 16, 2021, at 12:00 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>>> 
>>>> On Mon, Nov 15, 2021 at 11:13:42PM -0800, Song Liu wrote:
>>>>> These allow setting ro/x for the module_alloc() mapping, while leaving
>>>>> the linear mapping rw/nx.
>>>> 
>>>> This needs a very strong rationale for *why*. How does this not
>>>> trivially circumvent W^X ?
>>> 
>>> In this case, we want to have multiple BPF programs sharing the 2MB page. 
>>> When the JIT engine is working on one program, we would rather existing
>>> BPF programs on the same page stay on the RO+X mapping (the module_alloc()
>>> address). The solution in this version is to let the JIT engine write to 
>>> the page via linear address. 
>>> 
>>> An alternative is to only use the module_alloc() address, and flip the 
>>> read-only bit (of the whole 2MB page) back and forth. However, this 
>>> requires some serialization among different JIT jobs. 
>> 
>> Neither option seems acceptable to me, as they both violate W^X.
>> 
>> Please have a close look at arch/x86/kernel/alternative.c:__text_poke()
>> for how we modify active text. I think that or something very similar is
>> the only option. By having an alias in a special (user) address space
>> that is not accessible by any other CPU, only the poking CPU can exploit
>> this (temporary) hole, which is a much larger ask than any of the
>> proposed options.
> 
> I would agree that __text_poke() is a safer option. But in this case, we 
> will need the temporary hole to be 2MB in size. Also, we will probably 
> hold the temporary mapping for a longer time (the whole JITing process).
> Does this sound reasonable?

Actually, the hole will not always be 2MB in size, but it could be up to
2MB.

Song
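
For readers following the W^X argument, the aliasing concern can be
illustrated with a small userspace analogy. This is only a sketch and not
kernel code or anything taken from the patch: it maps one page of a
temporary file twice, once read-write (standing in for the linear mapping)
and once read-only (standing in for the module_alloc() address). The names
alias_pair, alias_pair_create() and alias_write_then_read() are
hypothetical and exist only for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper type: two virtual views of the same physical page. */
struct alias_pair {
	unsigned char *rw;       /* writable alias (analogy: linear mapping) */
	const unsigned char *ro; /* read-only alias (analogy: module_alloc()) */
	size_t len;
};

/* Map one page of an unlinked temporary file twice; 0 on success, -1 on error. */
static int alias_pair_create(struct alias_pair *p)
{
	char path[] = "/tmp/alias-demo-XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	void *rw, *ro;
	int fd;

	if (page <= 0)
		return -1;
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path); /* the file lives only as long as the mappings */
	if (ftruncate(fd, page) != 0) {
		close(fd);
		return -1;
	}

	rw = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	ro = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
	close(fd); /* mappings stay valid after the descriptor is closed */

	if (rw == MAP_FAILED || ro == MAP_FAILED) {
		if (rw != MAP_FAILED)
			munmap(rw, (size_t)page);
		if (ro != MAP_FAILED)
			munmap(ro, (size_t)page);
		return -1;
	}

	p->rw = rw;
	p->ro = ro;
	p->len = (size_t)page;
	return 0;
}

/* Store through the writable alias, then read back through the read-only one. */
static int alias_write_then_read(struct alias_pair *p, size_t off,
				 unsigned char val)
{
	assert(off < p->len);
	p->rw[off] = val;
	return p->ro[off];
}
```

A store through p->rw is immediately visible through p->ro, even though
p->ro itself cannot be written. That is the property the objection above
is about: as long as a writable alias exists anywhere, the read-only
permission on the other address does not by itself stop the contents from
changing.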



