On Mon, May 11, 2020 at 12:42 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Sat, May 09, 2020 at 12:05:29PM -0700, Andy Lutomirski wrote:
>
> > On x86_64, the only real advantage is that the handful of corner cases
> > that make vmalloc faults unpleasant (mostly relating to vmap stacks)
> > go away. On x86_32, a bunch of mind-bending stuff (everything your
> > series deletes but also almost everything your series *adds*) goes
> > away. There may be a genuine tiny performance hit on 2-level systems
> > due to the loss of huge pages in vmalloc space, but I'm not sure I
> > care or that we use them anyway on these systems. And PeterZ can stop
> > even thinking about RCU.
> >
> > Am I making sense?
>
> I think it'll work for x86_64 and that is really all I care about :-)

Sadly, I think that Joerg has convinced me that this doesn't really
work for 32-bit unless we rework the LDT code or drop support for
something that we might not want to drop support for.

So, last try -- maybe we can start defeaturing 32-bit:

What if we make 32-bit PTI depend on PAE?

And drop 32-bit Xen PV support?

And make 32-bit huge pages depend on PAE?

Then 32-bit non-PAE can use the direct-mapped LDT, and 32-bit PTI (and
optionally PAE non-PTI) can use the evil virtually mapped LDT. And
32-bit non-PAE (the 2-level case) will only have pointers to page
tables at the top level. And then we can preallocate.

Or maybe we don't want to defeature this much, or maybe the memory hit
from this preallocation will hurt little 2-level 32-bit systems too
much.

(Xen 32-bit PV support seems to be on its way out upstream.)
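
To make the proposed split concrete, here is a rough sketch of which
32-bit configurations would end up with which LDT mapping. This is only
an illustration: every name below (cfg32, pick_ldt_mapping, and so on)
is hypothetical and is not an existing kernel identifier or Kconfig
symbol, and the "optional" PAE non-PTI case is modeled as a plain flag.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not kernel identifiers. */
enum ldt_mapping {
	LDT_DIRECT_MAPPED,
	LDT_VIRTUALLY_MAPPED,
	LDT_INVALID_CONFIG,	/* combination the proposal would forbid */
};

struct cfg32 {
	bool pae;			/* 3-level paging */
	bool pti;			/* page table isolation */
	bool pae_nonpti_virtual_ldt;	/* the "optionally" case above */
};

/* Which LDT mapping a 32-bit configuration would use under the proposal. */
static enum ldt_mapping pick_ldt_mapping(struct cfg32 c)
{
	if (!c.pae) {
		/* PTI would depend on PAE, so non-PAE + PTI is not allowed. */
		return c.pti ? LDT_INVALID_CONFIG : LDT_DIRECT_MAPPED;
	}
	if (c.pti)
		return LDT_VIRTUALLY_MAPPED;
	return c.pae_nonpti_virtual_ldt ? LDT_VIRTUALLY_MAPPED
					: LDT_DIRECT_MAPPED;
}

/* Huge pages would depend on PAE on 32-bit. */
static bool huge_pages_allowed(struct cfg32 c)
{
	return c.pae;
}

/*
 * Without huge pages, the top level of a 2-level (non-PAE) system only
 * holds pointers to page tables, which is what would make
 * preallocation possible.
 */
static bool top_level_only_points_to_page_tables(struct cfg32 c)
{
	return !c.pae && !huge_pages_allowed(c);
}
```

With this split, the only configuration that needs the virtually mapped
LDT is one that already has PAE, and the 2-level case stays simple.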