Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

Andy Lutomirski <luto@xxxxxxxxxxxxxx> · Sat, 15 Jun 2019 08:30:08 -0700

> On Jun 14, 2019, at 3:06 PM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> 
>> On 6/14/19 2:34 PM, Yu-cheng Yu wrote:
>> On Fri, 2019-06-14 at 13:57 -0700, Dave Hansen wrote:
>>>> I have a related question:
>>>> 
>>>> Do we allow the application to read the bitmap, or any fault from the
>>>> application on bitmap pages?
>>> 
>>> We have to allow apps to read it.  Otherwise they can't execute
>>> instructions.
>> 
>> What I meant was, if an app executes some legacy code that results in bitmap
>> lookup, but the bitmap page is not yet populated, and if we then populate that
>> page with all-zero, a #CP should follow.  So do we even populate that zero page
>> at all?
>> 
>> I think we should; a #CP is more obvious to the user at least.
> 
> Please make an effort to un-Intel-ificate your messages as much as
> possible.  I'd really prefer that folks say "missing end branch fault"
> rather than #CP.  I had to Google "#CP".
> 
> I *think* you are saying that:  The *only* lookups to this bitmap are on
> "missing end branch" conditions.  Normal, proper-functioning code
> execution that has ENDBR instructions in it will never even look at the
> bitmap.  The only case when we reference the bitmap locations is when
> the processor is about do do a "missing end branch fault" so that it can
> be suppressed.  Any population with the zero page would be done when
> code had already encountered a "missing end branch" condition, and
> populating with a zero-filled page will guarantee that a "missing end
> branch fault" will result.  You're arguing that we should just figure
> this out at fault time and not ever reach the "missing end branch fault"
> at all.
> 
> Is that right?
> 
> If so, that's an architecture subtlety that I missed until now and which
> went entirely unmentioned in the changelog and discussion up to this
> point.  Let's make sure that nobody else has to walk that path by
> improving our changelog, please.
> 
> In any case, I don't think this is worth special-casing our zero-fill
> code, FWIW.  It's not performance critical and not worth the complexity.
> If apps want to handle the signals and abuse this to fill space up with
> boring page table contents, they're welcome to.  There are much easier
> ways to consume a lot of memory.

Isn’t it a special case either way?  Either we look at CR2 and populate a page, or we look at CR2 and the “tracker” state and send a different signal.  Admittedly the former is very common in the kernel.

> 
>>> We don't have to allow them to (popuating) fault on it.  But, if we
>>> don't, we need some kind of kernel interface to avoid the faults.
>> 
>> The plan is:
>> 
>> * Move STACK_TOP (and vdso) down to give space to the bitmap.
> 
> Even for apps with 57-bit address spaces?
> 
>> * Reserve the bitmap space from (mm->start_stack + PAGE_SIZE) to cover a code
>> size of TASK_SIZE_LOW, which is (TASK_SIZE_LOW / PAGE_SIZE / 8).
> 
> The bitmap size is determined by CR4.LA57, not the app.  If you place
> the bitmap here, won't references to it for high addresses go into the
> high address space?
> 
> Specifically, on a CR4.LA57=0 system, we have 48 bits of address space,
> so 128TB for apps.  You are proposing sticking the bitmap above the
> stack which is near the top of that 128TB address space.  But on a
> 5-level paging system with CR4.LA57=1, there could be valid data at
> 129GB.  Is there something keeping that data from being mistaken for
> being part of the bitmap?
> 

I think we need to make the vma be full sized — it should cover the entire range that the CPU might access. If that means it spans the 48-bit boundary, so be it.

> Also, if you're limiting it to TASK_SIZE_LOW, please don't forget that
> this is yet another thing that probably won't work with the vsyscall
> page.  Please make sure you consider it and mention it in your next post.

Why not?  The vsyscall page is at a negative address.

> 
>> * Mmap the space only when the app issues the first mark-legacy prctl.  This
>> avoids the core-dump issue for most apps and the accounting problem that
>> MAP_NORESERVE probably won't solve

What happens if there’s another VMA there by the time you map it?