On Tue, 30 Mar 2021 at 14:44, Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> On Tue, 30 Mar 2021 12:21:26 +0100,
> Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > Commit f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA
> > configurations") introduced a new layout for the 52-bit VA space, in
> > order to maximize the space available to the linear region. After this
> > change, the kernel VA space is no longer split 1:1 down the middle, and
> > as it turns out, this violates an assumption in the KVM init code when
> > it chooses the layout for the nVHE EL2 mapping.
> >
> > Given that EFI does not support 52-bit VA addressing (as it only
> > supports 4k pages), and that in general, loaders cannot assume that the
> > kernel being loaded supports 52-bit VA/PA addressing in the first place,
> > we can safely assume that the kernel, and therefore the .idmap section,
> > will be 48-bit addressable on 52-bit VA capable systems.
> >
> > So in this case, organize the nVHE EL2 address space as a 2^48 byte
> > window starting at address 0x0, containing the ID map and the
> > hypervisor's private mappings, followed by a contiguous 2^52 - 2^48 byte
> > linear region. (Note that EL1's linear region is 2^52 - 2^47 bytes in
> > size, so it is slightly larger, but this only matters on systems where
> > the DRAM footprint in the physical memory map exceeds 3968 TB)
>
> So if I have memory in the [2^52 - 2^48, 2^52 - 2^47] range, not
> necessarily because I have that much memory, but because my system has
> multiple memory banks, one of which lands on that spot, I cannot map
> such memory at EL2. We'll explode at run time.
>
> Can we keep the private mapping to 47 bits and restore the missing
> chunk to the linear mapping? Of course, it means that the linear map
> is now potentially not linear anymore, so we'd have to guarantee that
> the kernel lives in the first 2^47 bits instead. Crap.
>

Yeah. The linear region needs to be contiguous.
Alternatively, we could restrict the upper address limit for loading the
kernel to 47 bits.

> >
> > Fixes: f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA configurations")
> > Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > ---
> >  Documentation/arm64/booting.rst |  6 +++---
> >  arch/arm64/kvm/va_layout.c      | 18 ++++++++++++++----
> >  2 files changed, 17 insertions(+), 7 deletions(-)
> >
> > diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
> > index 7552dbc1cc54..418ec9b63d2c 100644
> > --- a/Documentation/arm64/booting.rst
> > +++ b/Documentation/arm64/booting.rst
> > @@ -121,8 +121,8 @@ Header notes:
> >  			  to the base of DRAM, since memory below it is not
> >  			  accessible via the linear mapping
> >  			1
> > -			  2MB aligned base may be anywhere in physical
> > -			  memory
> > +			  2MB aligned base may be anywhere in the 48-bit
> > +			  addressable physical memory region
> >    Bits 4-63	Reserved.
> >  ============= ===============================================================
> >
> > @@ -132,7 +132,7 @@ Header notes:
> >  depending on selected features, and is effectively unbound.
> >
> >  The Image must be placed text_offset bytes from a 2MB aligned base
> > -address anywhere in usable system RAM and called there. The region
> > +address in 48-bit addressable system RAM and called there. The region
> >  between the 2 MB aligned base address and the start of the image has no
> >  special significance to the kernel, and may be used for other purposes.
> >  At least image_size bytes from the start of the image must be free for
> >
> > diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
> > index 978301392d67..e9ab449de197 100644
> > --- a/arch/arm64/kvm/va_layout.c
> > +++ b/arch/arm64/kvm/va_layout.c
> > @@ -62,9 +62,19 @@ __init void kvm_compute_layout(void)
> >  	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
> >  	u64 hyp_va_msb;
> >
> > -	/* Where is my RAM region? */
> > -	hyp_va_msb = idmap_addr & BIT(vabits_actual - 1);
> > -	hyp_va_msb ^= BIT(vabits_actual - 1);
> > +	/*
> > +	 * On LVA capable hardware, the kernel is guaranteed to reside
> > +	 * in the 48-bit addressable part of physical memory, and so
> > +	 * the idmap will be located there as well. Put the EL2 linear
> > +	 * region right after it, where it can grow upward to fill the
> > +	 * entire 52-bit VA region.
> > +	 */
> > +	if (vabits_actual > VA_BITS_MIN) {
> > +		hyp_va_msb = BIT(VA_BITS_MIN);
> > +	} else {
> > +		hyp_va_msb = idmap_addr & BIT(vabits_actual - 1);
> > +		hyp_va_msb ^= BIT(vabits_actual - 1);
> > +	}
> >
> >  	tag_lsb = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
> >  			(u64)(high_memory - 1));
> > @@ -72,7 +82,7 @@ __init void kvm_compute_layout(void)
> >  	va_mask = GENMASK_ULL(tag_lsb - 1, 0);
> >  	tag_val = hyp_va_msb;
> >
> > -	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && tag_lsb != (vabits_actual - 1)) {
> > +	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && tag_lsb < (vabits_actual - 1)) {
> >  		/* We have some free bits to insert a random tag. */
> >  		tag_val |= get_random_long() & GENMASK_ULL(vabits_actual - 2, tag_lsb);
> >  	}
>
> It seems __create_hyp_private_mapping() still refers to (VA_BITS - 1)
> to choose where to allocate the IO mappings, and
> __pkvm_create_private_mapping() relies on similar things based on what
> hyp_create_idmap() does.
>

That was probably broken already then, given that it should refer to
vabits_actual. I'll address that in a separate patch.

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm