Re: [PATCH 12/13] x86/jitalloc: prepare to allocate exectuatble memory as ROX

"Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx> · Thu, 1 Jun 2023 18:13:44 +0000

On Thu, 2023-06-01 at 14:00 -0400, Kent Overstreet wrote:
> On Thu, Jun 01, 2023 at 04:54:27PM +0000, Edgecombe, Rick P wrote:
> > It is just a local flush, but I wonder how much text_poke()ing is
> > too
> > much. A lot of the are even inside loops. Can't it do the batch
> > version
> > at least?
> > 
> > The other thing, and maybe this is in paranoia category, but it's
> > probably at least worth noting. Before the modules were not made
> > executable until all of the code was finalized. Now they are made
> > executable in an intermediate state and then patched later. It
> > might
> > weaken the CFI stuff, but also it just kind of seems a bit
> > unbounded
> > for dealing with executable code.
> 
> I believe bpf starts out by initializing new executable memory with
> illegal opcodes, maybe we should steal that and make it standard.

I was thinking of modules which have a ton of alternatives, errata
fixes, etc applied to them after the initial sections are written to
the to-be-executable mapping. I thought this had zeroed pages to start,
which seems ok.

> 
> > Preparing the modules in a separate RW mapping, and then
> > text_poke()ing
> > the whole thing in when you are done would resolve both of these.
> 
> text_poke() _does_ create a separate RW mapping.

Sorry, I meant a separate RW allocation.

> 
> The thing that sucks about text_poke() is that it always does a full
> TLB
> flush, and AFAICT that's not remotely needed. What it really wants to
> be
> doing is conceptually just
> 
> kmap_local()
> mempcy()
> kunmap_loca()
> flush_icache();
> 
> ...except that kmap_local() won't actually create a new mapping on
> non-highmem architectures, so text_poke() open codes it.

Text poke creates only a local CPU RW mapping. It's more secure because
other threads can't write to it. It also only needs to flush the local
core when it's done since it's not using a shared MM. It used to use
the fixmap, which is similar to what you are describing I think.