On Wed, 9 Nov 2022 at 15:42, Alexandru Elisei <alexandru.elisei@xxxxxxx> wrote:
>
> Hi,
>
> On Tue, Nov 08, 2022 at 04:15:09PM +0100, Ard Biesheuvel wrote:
> > Alexandru reports that his Ampere Altra machine, whose buggy firmware
> > triggers a synchronous exception in its implementation of SetTime() when
> > called without SetVirtualAddressMap() having been called first, doesn't
> > quite recover from this, and starts spewing error messages into the log
> > that are unrelated to the buggy runtime service.
> >
> > The driver in question is the EFI RTC driver, which should be fixed in
> > any case, as flooding the log like that (or doing any logging to the
> > kernel log at all on something which is not a severe issue) is not ok.
> >
> > However, in this particular case, it would be beneficial for both
> > ordinary use as well as diagnostics regarding broken firmware if we only
> > prevent the broken runtime service from being called again, and permit
> > others (such as GetTime() which triggers the logging or the variable
> > services) from being used as normal.
> >
> > So wire up the existing efi.runtime_supported_mask, and clear the
> > service's bit in the mask if the generic runtime service wrapper
> > observes a return value of EFI_ABORTED, which only happens if a service
> > call is aborted due to an exception. (EFI_ABORTED is not documented as a
> > valid error code for any of the EFI runtime services).
>
> With a kernel built from v6.1-rc4, when doing efibootmgr after the EFI panic
> happens (so with runtime services disabled), this is what I get:
>
> # efibootmgr
> Skipping unreadable variable "Boot0001": Interrupted system call
> Skipping unreadable variable "Boot0002": Interrupted system call
> show_order(): Interrupted system call
>
> And dmesg shows:
>
> [ 55.941312] efi: EFI Runtime Services are disabled!
>
> With this patch on top of v6.1-rc4:
>
> # efibootmgr
> Skipping unreadable variable "Boot0001": Invalid argument
> Skipping unreadable variable "Boot0002": Invalid argument
> show_order(): Invalid argument
>
> Same thing happens if I cat the Boot001 efivars file. Nothing is printed
> on dmesg.
>

OK, this strongly suggests that the EFI runtime services end up in a
funny state after the crash of SetTime(), and subsequent calls to any
of them no longer work as expected.

> Changed efi_call_rts() to print the return value, status is
> 0x8000_0000_0000_000f (or 15 in decimal if cast to an int). Tried to
> debug further, but I'm not familiar with all the structs and what they
> represent (for example, efi_call_virt(get_variable, args) calls
> efi_call_virt_pointer(efi.runtime, get_variable, args), does it end up as
> __efi_rt_asm_wrapper((efi.runtime)->get_variable, "get_variable", args?)

Indeed. The value of the function pointer is used to make the indirect
call, and the string is only used if an error occurs, so that we can
print the name of the service to the log. The remaining arguments are
simply the arguments to the firmware call.

> As
> an aside, it would be really helpful to document the arguments for
> __efi_rt_asm_wrapper. Pointers here how to debug further would be very
> welcome.
>

If the log is completely silent, there is not a lot to debug, really.
The error value you are observing is EFI_ACCESS_DENIED, and looking at
the open source version of the Mt. Jade firmware, this might be the
value returned from the secure world helper.

One other thing I would like to try is disabling set_time specifically
using a command line parameter.

Btw could you share the output of dmidecode as well?