Re: [kvm-unit-tests PATCH v2 07/10] x86: efi, smp: Transition APs from 16-bit to 32-bit mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 12, 2022, Varad Gautam wrote:
> Sending INIT/SIPI to APs from ap_init() resets them into 16-bit mode
> to loop into sipi_entry().
> 
> To drive the APs into 32-bit mode, the SIPI vector needs:
> 1. A GDT descriptor reachable from 16-bit code (gdt32_descr).
> 2. A 32-bit entrypoint reachable from 16-bit code (ap_start32).
> 3. The locations of GDT and the 32-bit entrypoint.
> 
> Setting these up at compile time (like on non-EFI builds) is not
> possible since EFI builds with -shared -fPIC and efistart64.S cannot
> reference any absolute addresses.
> 
> Relative addressing is unavailable on 16-bit mode.
> 
> Moreover, EFI may not load the 32-bit entrypoint to be reachable from
> 16-bit mode.
> 
> To overcome these problems,
> 1. Fill the GDT descriptor at runtime after relocating
>    [sipi_entry-sipi_end] to lowmem. Since sipi_entry does not know the
>    address of this descriptor, use the last two bytes of SIPI page to
>    communicate it.
> 2. Place a call gate in the GDT to point to ap_start32.
> 3. Popluate sipi_entry() to lcall to ap_start32.
> 
> With this, the APs can transition to 32-bit mode and loop at a known
> location.
> 
> Signed-off-by: Varad Gautam <varad.gautam@xxxxxxxx>
> ---
>  lib/x86/smp.c        | 56 ++++++++++++++++++++++++++++++++++++++++++++
>  x86/efi/efistart64.S | 29 ++++++++++++++++++++++-
>  2 files changed, 84 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/x86/smp.c b/lib/x86/smp.c
> index d7f5aba..5cc1648 100644
> --- a/lib/x86/smp.c
> +++ b/lib/x86/smp.c
> @@ -6,6 +6,7 @@
>  #include "apic.h"
>  #include "fwcfg.h"
>  #include "desc.h"
> +#include "asm/page.h"
>  
>  #define IPI_VECTOR 0x20
>  
> @@ -144,16 +145,71 @@ void smp_reset_apic(void)
>  	atomic_inc(&active_cpus);
>  }
>  
> +#ifdef CONFIG_EFI
> +extern u8 gdt32_descr, gdt32, gdt32_end;
> +extern u8 ap_start32;
> +#endif
> +
>  void ap_init(void)
>  {
>  	u8 *dst_addr = 0;
>  	size_t sipi_sz = (&sipi_end - &sipi_entry) + 1;
>  
> +	assert(sipi_sz < PAGE_SIZE);
> +
>  	asm volatile("cld");
>  
>  	/* Relocate SIPI vector to dst_addr so it can run in 16-bit mode. */
> +	memset(dst_addr, 0, PAGE_SIZE);
>  	memcpy(dst_addr, &sipi_entry, sipi_sz);
>  
> +#ifdef CONFIG_EFI
> +	volatile struct descriptor_table_ptr *gdt32_descr_rel;
> +	idt_entry_t *gate_descr;
> +	u16 *gdt32_descr_reladdr = (u16 *) (PAGE_SIZE - sizeof(u16));

Ah, the gdt32 name confused me for a bit.  Can we use something more unique?
Maybe efi_rm_trampoline_gdt?  Or just efi_trampoline_gdt since the "rm" part is
technically wrong (though consistent).

Oooh, and the "PAGE_SIZE - sizeof(u16)" is another instance of open coding
"PAGE_SIZE - 2".  Definitely need a #define for that.

> +
> +	/*
> +	 * gdt32_descr for CONFIG_EFI needs to be filled here dynamically
> +	 * since compile time calculation of offsets is not allowed when
> +	 * building with -shared, and rip-relative addressing is not supported
> +	 * in 16-bit mode.
> +	 *
> +	 * Use the last two bytes of SIPI page to store relocated gdt32_descr
> +	 * addr.

Ooh, I see, it's a double indirection.  Load the address to the descriptor, then
load the descriptor which holds the address to the GDT.  I kept thinking that 
2 bytes were the limit chunk of the descriptor.

Maybe instead of "relocated gdt32_descr addr", "a pointer to the relocated
descriptor used by the trampoline to load the GDT"?

> +	 */
> +	*gdt32_descr_reladdr = (&gdt32_descr - &sipi_entry);
> +
> +	gdt32_descr_rel = (struct descriptor_table_ptr *) ((u64) *gdt32_descr_reladdr);
> +	gdt32_descr_rel->limit = (u16) (&gdt32_end - &gdt32 - 1);
> +	gdt32_descr_rel->base = (ulong) ((u32) (&gdt32 - &sipi_entry));
> +
> +	/*
> +	 * EFI may not load the 32-bit AP entrypoint (ap_start32) low enough
> +	 * to be reachable from the SIPI vector. Since we build with -shared, this

Nit, please avoid "we" and other pronous, e.g. "Since KVM-unit-tests is built with..."
avoids any ambiguity.

> +	 * location needs to be fetched at runtime, and rip-relative addressing is
> +	 * not supported in 16-bit mode.
> +	 * To perform 16-bit -> 32-bit far jump, our options are:

"our options" can be tweaked to something like "Alternatives to a CALL GATE are".

> +	 * - ljmpl $cs, $label : unusable since $label is not known at build time.
> +	 * - push $cs; push $label; lret : requires an intermediate trampoline since
> +	 *	 $label must still be within 0 - 0xFFFF for 16-bit far return to work.
> +	 * - lcall into a call-gate : best suited.

I very much appreciate the comment, but it's backwards in the sense that I had no
idea what I was reading until the very end.  I.e. lead with the below "Set up a
call gate ..." blurb, then dive into the alternatives.  First and forement, help
the reader understand what is actually implemented, then call out the alternatives
and why they're inferior.

> +	 *
> +	 * Set up call gate to ap_start32 within GDT.
> +	 *
> +	 * gdt32 layout:
> +	 *
> +	 * Entry | Segment
> +	 * 0	 | NULL descr
> +	 * 1	 | Code segment descr
> +	 * 2	 | Data segment descr
> +	 * 3	 | Call gate descr
> +	 */
> +	gate_descr = (idt_entry_t *) ((u8 *)(&gdt32 - &sipi_entry)
> +		+ 3 * sizeof(gdt_entry_t));

Ah, it's just a coincidence that this GDT has the same layout as the "real" GDT,
e.g. has the same KERNEL_DS and KERNEL_CS.  It took me a while to suss out that
the APs do indeed reload a different GDT during ap_start64().

In the comment above, can you add a blurb to call out that it's purely coincidental
and not strictly required that this matches the initial part of the final GDT, and
that APs will load the final GDT in ap_start64()?  I kept trying to figure out how
this didn't break tests that use other selectors :-)

> +	set_idt_entry_t(gate_descr, sizeof(gdt_entry_t), (void *) &ap_start32,
> +		0x8 /* sel */, 0xc /* type */, 0 /* dpl */);
> +#endif
> +
>  	/* INIT */
>  	apic_icr_write(APIC_DEST_ALLBUT | APIC_DEST_PHYSICAL | APIC_DM_INIT | APIC_INT_ASSERT, 0);
>  



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux