> On Jun 14, 2019, at 3:06 PM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>
>> On 6/14/19 2:34 PM, Yu-cheng Yu wrote:
>> On Fri, 2019-06-14 at 13:57 -0700, Dave Hansen wrote:
>>>> I have a related question:
>>>>
>>>> Do we allow the application to read the bitmap, or any fault from the
>>>> application on bitmap pages?
>>>
>>> We have to allow apps to read it. Otherwise they can't execute
>>> instructions.
>>
>> What I meant was, if an app executes some legacy code that results in bitmap
>> lookup, but the bitmap page is not yet populated, and if we then populate that
>> page with all-zero, a #CP should follow. So do we even populate that zero page
>> at all?
>>
>> I think we should; a #CP is more obvious to the user at least.
>
> Please make an effort to un-Intel-ificate your messages as much as
> possible. I'd really prefer that folks say "missing end branch fault"
> rather than #CP. I had to Google "#CP".
>
> I *think* you are saying that: The *only* lookups to this bitmap are on
> "missing end branch" conditions. Normal, proper-functioning code
> execution that has ENDBR instructions in it will never even look at the
> bitmap. The only case when we reference the bitmap locations is when
> the processor is about to do a "missing end branch fault" so that it can
> be suppressed. Any population with the zero page would be done when
> code had already encountered a "missing end branch" condition, and
> populating with a zero-filled page will guarantee that a "missing end
> branch fault" will result. You're arguing that we should just figure
> this out at fault time and not ever reach the "missing end branch fault"
> at all.
>
> Is that right?
>
> If so, that's an architecture subtlety that I missed until now and which
> went entirely unmentioned in the changelog and discussion up to this
> point. Let's make sure that nobody else has to walk that path by
> improving our changelog, please.
>
> In any case, I don't think this is worth special-casing our zero-fill
> code, FWIW. It's not performance critical and not worth the complexity.
> If apps want to handle the signals and abuse this to fill space up with
> boring page table contents, they're welcome to. There are much easier
> ways to consume a lot of memory.

Isn’t it a special case either way? Either we look at CR2 and populate a page, or we look at CR2 and the “tracker” state and send a different signal. Admittedly the former is very common in the kernel.

>
>>> We don't have to allow them to (populating) fault on it. But, if we
>>> don't, we need some kind of kernel interface to avoid the faults.
>>
>> The plan is:
>>
>> * Move STACK_TOP (and vdso) down to give space to the bitmap.
>
> Even for apps with 57-bit address spaces?
>
>> * Reserve the bitmap space from (mm->start_stack + PAGE_SIZE) to cover a code
>> size of TASK_SIZE_LOW, which is (TASK_SIZE_LOW / PAGE_SIZE / 8).
>
> The bitmap size is determined by CR4.LA57, not the app. If you place
> the bitmap here, won't references to it for high addresses go into the
> high address space?
>
> Specifically, on a CR4.LA57=0 system, we have 48 bits of address space,
> so 128TB for apps. You are proposing sticking the bitmap above the
> stack which is near the top of that 128TB address space. But on a
> 5-level paging system with CR4.LA57=1, there could be valid data at
> 129TB. Is there something keeping that data from being mistaken for
> being part of the bitmap?
>

I think we need to make the vma be full sized: it should cover the entire range that the CPU might access. If that means it spans the 48-bit boundary, so be it.

> Also, if you're limiting it to TASK_SIZE_LOW, please don't forget that
> this is yet another thing that probably won't work with the vsyscall
> page. Please make sure you consider it and mention it in your next post.

Why not? The vsyscall page is at a negative address.
>
>> * Mmap the space only when the app issues the first mark-legacy prctl. This
>> avoids the core-dump issue for most apps and the accounting problem that
>> MAP_NORESERVE probably won't solve

What happens if there’s another VMA there by the time you map it?