On Sat, May 09, 2020 at 10:05:43PM -0700, Andy Lutomirski wrote:
> On Sat, May 9, 2020 at 2:57 PM Joerg Roedel <joro@xxxxxxxxxx> wrote:
> I spent some time looking at the code, and I'm guessing you're talking
> about the 3-level !SHARED_KERNEL_PMD case. I can't quite figure out
> what's going on.
>
> Can you explain what is actually going on that causes different
> mms/pgds to have top-level entries in the kernel range that point to
> different tables? Because I'm not seeing why this makes any sense.

There are three cases where the PMDs are not shared on x86-32:

	1) With non-PAE the top level is already the PMD level, because
	   the page-table only has two levels. Since the top level can't
	   be shared, the PMDs are also not shared.

	2) For some reason Xen-PV also does not share kernel PMDs on PAE
	   systems, but I haven't looked into why.

	3) On 32-bit PAE with PTI enabled the kernel address space
	   contains the LDT mapping, which is also different per-PGD.
	   There is one PMD entry reserved for the LDT, giving it 2MB of
	   address space. I implemented it this way to keep the 32-bit
	   implementation of PTI mostly similar to the 64-bit one.

> Why does it need to be partitioned at all? The only thing that comes
> to mind is that the LDT range is per-mm. So I can imagine that the
> PAE case with a 3G user / 1G kernel split has to have the vmalloc
> range and the LDT range in the same top-level entry. Yuck.

PAE with 3G user / 1G kernel has _all_ of the kernel mappings in one
top-level entry (direct-mapping, vmalloc, LDT, fixmap).

> If it's *just* the LDT that's a problem, we could plausibly shrink the
> user address range a little bit and put the LDT in the user portion.
> I suppose this could end up creating its own set of problems involving
> tracking which code owns which page tables.

Yeah, for the PTI case it is only the LDT that causes the unshared
kernel PMDs, but even if we move the LDT somewhere else we still have
two-level paging and the Xen-PV case.

	Joerg
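
P.S. For anyone following the arithmetic behind "one PMD entry gives 2MB"
and "all kernel mappings in one top-level entry", here is a minimal
sketch of the index math for 32-bit PAE. The SKETCH_* names and the
helper functions are made up for illustration and are not the kernel's
own macros; the 0xC0000000 kernel base assumes the 3G/1G split
discussed above.

```c
#include <stdint.h>

/* Hypothetical constants describing x86-32 PAE paging geometry. */
#define SKETCH_PMD_SHIFT     21           /* one PMD entry maps 2^21 bytes = 2MB */
#define SKETCH_PGD_SHIFT     30           /* one top-level entry maps 2^30 bytes = 1GB */
#define SKETCH_PTRS_PER_PMD  512          /* 512 entries per PMD table under PAE */
#define SKETCH_KERNEL_BASE   0xC0000000u  /* assumed 3G user / 1G kernel split */

/* Index of the top-level (PGD) entry covering a virtual address. */
static inline unsigned int sketch_pgd_index(uint32_t addr)
{
	return addr >> SKETCH_PGD_SHIFT;
}

/* Index of the PMD entry, within its PMD table, covering a virtual address. */
static inline unsigned int sketch_pmd_index(uint32_t addr)
{
	return (addr >> SKETCH_PMD_SHIFT) & (SKETCH_PTRS_PER_PMD - 1);
}

/* Number of bytes of address space mapped by a single PMD entry. */
static inline uint32_t sketch_pmd_span(void)
{
	return (uint32_t)1 << SKETCH_PMD_SHIFT;
}
```

With these definitions, every address from SKETCH_KERNEL_BASE up to
0xFFFFFFFF yields top-level index 3, which is why a single per-mm PMD
entry for the LDT is enough to make that whole kernel PMD page unshared.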