On 10/14/24 03:55, Ryan Roberts wrote:
Hi All,

Patch bomb incoming... This covers many subsystems, so I've included a core set of people on the full series and additionally included maintainers on relevant patches. I haven't included those maintainers on this cover letter since the numbers were far too big for it to work. But I've included a link to this cover letter on each patch, so they can hopefully find their way here. For follow-up submissions I'll break it up by subsystem, but for now thought it was important to show the full picture.

This RFC series implements support for boot-time page size selection within the arm64 kernel. arm64 supports 3 base page sizes (4K, 16K, 64K), but to date, page size has been selected at compile-time, meaning the size is baked into a given kernel image. As use of larger-than-4K page sizes becomes more prevalent this starts to present a problem for distributions. Boot-time page size selection enables the creation of a single kernel image, which can be told which page size to use on the kernel command line.

Why is having an image-per-page size problematic?
=================================================

Many traditional distros are now supporting both 4K and 64K. And this means managing 2 kernel packages, along with drivers for each. For some, it means multiple installer flavours and multiple ISOs. All of this adds up to a less-than-ideal level of complexity. Additionally, Android now supports 4K and 16K kernels. I'm told having to explicitly manage their KABI for each kernel is painful, and the extra flash space required for both kernel images and the duplicated modules has been problematic. Boot-time page size selection solves all of this.

Additionally, in starting to think about the longer term deployment story for D128 page tables, which Arm architecture now supports, a lot of the same problems need to be solved, so this work sets us up nicely for that.

So what's the down side?
========================

Well nothing's free; various static allocations in the kernel image must be sized for the worst case (largest supported page size), so image size is in line with size of 64K compile-time image. So if you're interested in 4K or 16K, there is a slight increase to the image size. But I expect that problem goes away if you're compressing the image - it's just some extra zeros. At boot-time, I expect we could free the unused static storage once we know the page size - although that would be a follow-up enhancement.

And then there is performance. Since PAGE_SIZE and friends are no longer compile-time constants, we must look up their values and do arithmetic at runtime instead of compile-time. My early perf testing suggests this is imperceptible for real-world workloads, and only has small impact on microbenchmarks - more on this below.

Approach
========

The basic idea is to rid the source of any assumptions that PAGE_SIZE and friends are compile-time constant, but in a way that allows the compiler to perform the same optimizations as was previously being done if they do turn out to be compile-time constant. Where constants are required, we use limits; PAGE_SIZE_MIN and PAGE_SIZE_MAX. See commit log in patch 1 for full description of all the classes of problems to solve.

By default PAGE_SIZE_MIN=PAGE_SIZE_MAX=PAGE_SIZE. But an arch may opt-in to boot-time page size selection by defining PAGE_SIZE_MIN & PAGE_SIZE_MAX. arm64 does this if the user selects the CONFIG_ARM64_BOOT_TIME_PAGE_SIZE Kconfig, which is an alternative to selecting a compile-time page size.

When boot-time page size is active, the arch pgtable geometry macro definitions resolve to something that can be configured at boot. The arm64 implementation in this series mainly uses global, __ro_after_init variables. I've tried using alternatives patching, but that performs worse than loading from memory; I think due to code size bloat.
FWIW, this paragraph was not entirely clear to me until I looked at patch 57 and saw that compile-time page size selection has been retained and can continue to be used as-is. That was somewhat implicit, but IMHO not explicit enough. Not a big deal, though.
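To make the relationship between the two modes concrete, here is a minimal user-space sketch of how the compile-time and boot-time cases can share one set of macros. It is not taken from the series: BOOT_TIME_PAGE_SIZE stands in for the Kconfig choice, and PAGE_SHIFT_MIN, PAGE_SHIFT_MAX, page_shift, set_page_shift(), pages_needed() and scratch_page are hypothetical names used only for illustration. Only PAGE_SIZE, PAGE_SIZE_MIN, PAGE_SIZE_MAX and __ro_after_init come from the cover letter.

```c
#include <assert.h>	/* static_assert (C11) */

/* Hypothetical stand-in for the Kconfig choice; not the real config plumbing. */
#ifndef BOOT_TIME_PAGE_SIZE
#define BOOT_TIME_PAGE_SIZE 1
#endif

#if BOOT_TIME_PAGE_SIZE
/* Boot-time selection: only the limits are compile-time constants. */
#define PAGE_SHIFT_MIN	12	/* 4K */
#define PAGE_SHIFT_MAX	16	/* 64K */
/* The series describes such globals as __ro_after_init; plain static here. */
static unsigned int page_shift = PAGE_SHIFT_MIN;
#define PAGE_SHIFT	page_shift
#else
/* Compile-time selection: MIN == MAX == chosen size, so everything folds. */
#define PAGE_SHIFT	12
#define PAGE_SHIFT_MIN	PAGE_SHIFT
#define PAGE_SHIFT_MAX	PAGE_SHIFT
#endif

#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_SIZE_MIN	(1UL << PAGE_SHIFT_MIN)
#define PAGE_SIZE_MAX	(1UL << PAGE_SHIFT_MAX)

/* Where a constant is required, only the limits can be used. */
static_assert(PAGE_SIZE_MIN <= PAGE_SIZE_MAX, "page size limits out of order");

/* Static storage is sized for the worst case (largest supported page size). */
unsigned char scratch_page[PAGE_SIZE_MAX];

/* Hypothetical early-boot hook: accept only the 4K, 16K and 64K shifts. */
int set_page_shift(unsigned int shift)
{
#if BOOT_TIME_PAGE_SIZE
	if (shift != 12 && shift != 14 && shift != 16)
		return -1;
	page_shift = shift;
	return 0;
#else
	return shift == PAGE_SHIFT ? 0 : -1;
#endif
}

/* Generic code is written once; it is runtime arithmetic only in boot-time mode. */
unsigned long pages_needed(unsigned long bytes)
{
	return (bytes + PAGE_SIZE - 1) >> PAGE_SHIFT;
}
```

Building with -DBOOT_TIME_PAGE_SIZE=0 takes the compile-time branch, where PAGE_SIZE_MIN, PAGE_SIZE_MAX and PAGE_SIZE are the same constant and the compiler can fold pages_needed() exactly as before.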
Great work, thanks for doing that! This makes me wonder if we could leverage any of that to have a single kernel supporting both LPAE and !LPAE on 32-bit ARM, but that still seems somewhat more difficult, largely due to the difference in the page table descriptor format (long vs. short).
-- Florian