On Sun, 22 Dec 2019 at 13:46, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
>
>
> > On Dec 22, 2019, at 8:02 PM, Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
> >
> > On Sat, 21 Dec 2019 at 22:22, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
> >>
> >> Hi Ard,
> >>
> >>> On 18-12-2019 18:01, Ard Biesheuvel wrote:
> >>> We use special wrapper routines to invoke firmware services in the
> >>> native case as well as the mixed mode case. For mixed mode, the need
> >>> is obvious, but for the native cases, we can simply rely on the
> >>> compiler to generate the indirect call, given that GCC now has
> >>> support for the MS calling convention (and has had it for quite some
> >>> time now). Note that on i386, the decompressor and the EFI stub are not
> >>> built with -mregparm=3 like the rest of the i386 kernel, so we can
> >>> safely allow the compiler to emit the indirect calls here as well.
> >>>
> >>> So drop all the wrappers and indirection, and switch to either native
> >>> calls, or direct calls into the thunk routine for mixed mode.
> >>>
> >>> Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> >>
> >> I'm afraid that this patch breaks the boot on one of my machines.
> >>
> >> Specifically this patch breaks my GDP pocket machine. This is a Cherry
> >> Trail device with a 64 UEFI running a 64 bit kernel build.
> >>
> >> As soon as I cherry pick this patch into my personal 5.5.0-rc2 based
> >> tree, the GPD pocket stops booting and it stop so early on that I get 0
> >> debug output. I guess I could try adding a few pr_efi_err calls
> >> and see if those still do something.
> >>
> >> I noticed that you have made some changes to this patch, I've
> >> tried updating it to the version from your efistub-x86-cleanup-v3
> >> branch, commit id a37d90a2c570a25926fd1645482cb9f3c1d042a0
> >> and I have also cherry-picked the latest version of all preceding
> >> commits, unfortunately even with the new version, the GPD pocket
> >> still hangs at boot.
> >>
> >> Unfortunately the nature of this patch makes it hard to figure
> >> out the root cause of this issue...
> >>
> >> I've also tried another Cherry Trail device with 64 bit UEFI and
> >> that does not suffer from this problem.
> >>
> >
> > Thanks Hans.
> >
> > There are a number of things that change in the way the calls are
> > made, but the most obvious thing to check is whether the stack needs
> > to be aligned, since that is no longer being done.
> >
> > If you have time to experiment a bit more, could you check whether
> > doing 'and $~0xf %rsp' before 'call efi_main' in the .S stub code for
> > x86_64 makes a difference?
>
> Hmm. Most of the kernel is compiled with the stack alignment set to 8, and there a lot of asm that makes no effort to preserve alignment beyond 8 bytes. So if EFI calls need 16 byte alignment, you may need to do something special.
>
> On new enough gcc (the versions that actually support the flags to set the alignment to 8), maybe you can use function attributes, or maybe you can stick a 16-byte-aligned local variable in functions that call EFI functions? The latter would be rather fragile.

This patch replaces the open coded SysV to MS calling convention
translation with GCC generated code (using __attribute__((ms_abi)),
which we have been using for a long time in EDK2), because the former
relies on a wrapper function efi_call(fn, ...) which is type unsafe
and relies on a lot of nasty casting, especially combined with the
mixed mode support.

efi_call() is implemented as shown at the end of this mail, and as
Hans reports, omitting this sequence causes a boot regression on one
of the platforms he has tested this on. So the question is which of
the pieces of that sequence this UEFI implementation is actually
relying on. The stack pointer alignment is my first guess, but it
could be any of the other things as well. Once we identify what it is
we are missing, I can simply stick it back in, but without reverting
to using the efi_call() thunk.
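For illustration only, here is a minimal user-space sketch of what
"let the compiler generate the indirect call" means. It is not the
patch itself. All demo_* names and the DEMO_EFIAPI macro are
hypothetical, and the "firmware service" is just an ordinary C function
standing in for one. The point is that the function pointer carries its
full prototype and calling convention, so no untyped efi_call(fn, ...)
wrapper or casting is needed.

```c
#include <stdint.h>

/* Hypothetical names throughout; this is not kernel or EDK2 code. */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define DEMO_EFIAPI __attribute__((ms_abi))
#else
#define DEMO_EFIAPI /* ms_abi only exists on x86; plain call elsewhere */
#endif

typedef uint64_t demo_status_t;

/* Typed service pointer: arguments are checked at compile time. */
typedef demo_status_t (DEMO_EFIAPI *demo_service_t)(uint64_t a, uint64_t b,
						    uint64_t c, uint64_t d,
						    uint64_t e);

/* Stand-in for a firmware table that holds the service pointer. */
struct demo_table {
	demo_service_t service;
};

/*
 * Stand-in for a firmware routine. With ms_abi, the first four
 * arguments arrive in rcx, rdx, r8, r9 and the fifth on the stack.
 */
demo_status_t DEMO_EFIAPI demo_firmware_service(uint64_t a, uint64_t b,
						uint64_t c, uint64_t d,
						uint64_t e)
{
	return a + 10 * b + 100 * c + 1000 * d + 10000 * e;
}

/*
 * SysV caller making an indirect ms_abi call. The compiler emits the
 * register shuffling, shadow space and stack argument by itself.
 */
demo_status_t demo_call_service(const struct demo_table *t)
{
	return t->service(1, 2, 3, 4, 5);
}
```

Passing the wrong number or type of arguments to t->service() is now a
compile-time diagnostic, which is the type safety the untyped wrapper
cannot give. What the compiler does not do is preserve %xmm registers
or touch %cr0 the way the hand-written thunk does.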
Note that the decompressor/stub are built with the default stack
alignment of 16 afaict, but if GRUB enters the decompressor with a
misaligned stack, we probably wouldn't notice until we hit something
like a movaps, right?

Thanks,
Ard.


#define SAVE_XMM \
	mov %rsp, %rax; \
	subq $0x70, %rsp; \
	and $~0xf, %rsp; \
	mov %rax, (%rsp); \
	mov %cr0, %rax; \
	clts; \
	mov %rax, 0x8(%rsp); \
	movaps %xmm0, 0x60(%rsp); \
	movaps %xmm1, 0x50(%rsp); \
	movaps %xmm2, 0x40(%rsp); \
	movaps %xmm3, 0x30(%rsp); \
	movaps %xmm4, 0x20(%rsp); \
	movaps %xmm5, 0x10(%rsp)

#define RESTORE_XMM \
	movaps 0x60(%rsp), %xmm0; \
	movaps 0x50(%rsp), %xmm1; \
	movaps 0x40(%rsp), %xmm2; \
	movaps 0x30(%rsp), %xmm3; \
	movaps 0x20(%rsp), %xmm4; \
	movaps 0x10(%rsp), %xmm5; \
	mov 0x8(%rsp), %rsi; \
	mov %rsi, %cr0; \
	mov (%rsp), %rsp

SYM_FUNC_START(efi_call)
	pushq %rbp
	movq %rsp, %rbp
	SAVE_XMM
	mov 16(%rbp), %rax
	subq $48, %rsp
	mov %r9, 32(%rsp)
	mov %rax, 40(%rsp)
	mov %r8, %r9
	mov %rcx, %r8
	mov %rsi, %rcx
	call *%rdi
	addq $48, %rsp
	RESTORE_XMM
	popq %rbp
	ret
SYM_FUNC_END(efi_call)
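
To make the alignment part of the sequence above concrete, here is a
small sketch in C that only models the stack pointer arithmetic of
SAVE_XMM and the following 'subq $48, %rsp'. The demo_* helpers are
hypothetical and exist purely for this illustration. They do not touch
the real stack.

```c
#include <stdint.h>

/* Hypothetical helper: models "subq $0x70, %rsp; and $~0xf, %rsp". */
uint64_t demo_save_xmm_frame(uint64_t rsp)
{
	rsp -= 0x70;		/* room for old %rsp, %cr0, %xmm0-%xmm5 */
	rsp &= ~(uint64_t)0xf;	/* round down to a 16 byte boundary */
	return rsp;
}

/*
 * Hypothetical helper: stack pointer at "call *%rdi". The 48 bytes
 * are 32 bytes of MS ABI shadow space plus two stack arguments.
 */
uint64_t demo_rsp_at_call(uint64_t rsp_after_push_rbp)
{
	return demo_save_xmm_frame(rsp_after_push_rbp) - 48;
}
```

Because 48 is a multiple of 16, the value at the call is 16 byte
aligned whatever alignment efi_call() was entered with. For example,
an incoming value of 0x7fffffffe008 gives a frame at 0x7fffffffdf90
and 0x7fffffffdf60 at the call. That forced alignment is what the
compiler generated call no longer guarantees if the stub is entered
with a misaligned stack.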