On Tue, Jul 21, 2020 at 9:06 PM Palmer Dabbelt <palmer@xxxxxxxxxxx> wrote:
>
> On Tue, 21 Jul 2020 11:36:10 PDT (-0700), alex@xxxxxxxx wrote:
> > Let's try to make progress here: I add linux-mm in CC to get feedback on
> > this patch as it blocks sv48 support too.
>
> Sorry for being slow here. I haven't replied because I hadn't really fleshed
> out the design yet, but just so everyone's on the same page my problems with
> this are:
>
> * We waste vmalloc space on 32-bit systems, where there isn't a lot of it.

There is actually ongoing work to make 32-bit Arm kernels move
vmlinux into the vmalloc space, as part of the move to avoid highmem.

Overall, a 32-bit system would waste about 0.1% of its virtual address space
by having the kernel be located in both the linear map and the vmalloc area.
It's not zero, but not that bad either. With the typical split of 3072MB user,
768MB linear and 256MB vmalloc, it's also around 1.5% of the available
vmalloc area (assuming a 4MB vmlinux in a typical 32-bit kernel), but the
boundaries can be changed arbitrarily if needed.

The eventual goal is to have a split of 3840MB for either user or linear map
plus 256MB for vmalloc, including the kernel. Switching between linear
and user has a noticeable runtime overhead, but it relaxes both the limits
for user memory and lowmem, and it provides somewhat stronger
address space isolation.

Another potential idea would be to completely randomize the physical
addresses underneath the kernel by using a random permutation of the
pages in the kernel image. This adds even more overhead (virt_to_phys
may need to call vmalloc_to_page or similar) and may cause problems with
DMA into kernel .data across page boundaries,

> * Sort out how to maintain a linear map as the canonical hole moves around
> between the VA widths without adding a bunch of overhead to the virt2phys and
> friends. This is probably going to be the trickiest part, but I think if we
> just change the page table code to essentially lie about VAs when an sv39
> system runs an sv48+sv39 kernel we could make it work -- there'd be some
> logical complexity involved, but it would remain fast.

I assume you can't use the trick that x86 has where all kernel addresses are
at the top of the 64-bit address space and user addresses are at the bottom,
regardless of the size of the page tables?

      Arnd