On Wed, Jan 04, 2023 at 05:32:18PM +0100, Ard Biesheuvel wrote:
> On Wed, 4 Jan 2023 at 17:30, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> >
> > On Wed, Jan 04, 2023 at 05:15:34PM +0100, Ard Biesheuvel wrote:
> > > On Wed, 4 Jan 2023 at 17:13, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > > >
> > > > On Wed, Jan 04, 2023 at 02:56:19PM +0100, Ard Biesheuvel wrote:
> > > > > On Wed, 4 Jan 2023 at 11:40, Lee Jones <lee@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Mon, 05 Dec 2022, Ard Biesheuvel wrote:
> > > > > >
> > > > > > > With the introduction of PRMT in the ACPI subsystem, the EFI rts
> > > > > > > workqueue is no longer the only caller of efi_call_virt_pointer() in the
> > > > > > > kernel. This means the EFI runtime services lock is no longer sufficient
> > > > > > > to manage concurrent calls into firmware, but also that firmware calls
> > > > > > > may occur that are not marshalled via the workqueue mechanism, but
> > > > > > > originate directly from the caller context.
> > > > > > >
> > > > > > > For added robustness, and to ensure that the runtime services have 8 KiB
> > > > > > > of stack space available as per the EFI spec, introduce a spinlock
> > > > > > > protected EFI runtime stack of 8 KiB, where the spinlock also ensures
> > > > > > > serialization between the EFI rts workqueue (which itself serializes EFI
> > > > > > > runtime calls) and other callers of efi_call_virt_pointer().
> > > > > > >
> > > > > > > While at it, use the stack pivot to avoid reloading the shadow call
> > > > > > > stack pointer from the ordinary stack, as doing so could produce a
> > > > > > > gadget to defeat it.
> > > > > > >
> > > > > > > Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > > > > > > ---
> > > > > > >  arch/arm64/include/asm/efi.h       |  3 +++
> > > > > > >  arch/arm64/kernel/efi-rt-wrapper.S | 13 +++++++++-
> > > > > > >  arch/arm64/kernel/efi.c            | 25 ++++++++++++++++++++
> > > > > > >  3 files changed, 40 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > Could we have this in Stable please?
> > > > > >
> > > > > > Upstream commit: ff7a167961d1b ("arm64: efi: Execute runtime services from a dedicated stack")
> > > > > >
> > > > > > Ard, do we need Patch 2 as well, or can this be applied on its own?
> > > > > >
> > > > >
> > > > > Thanks for the reminder.
> > > > >
> > > > > Only patch #1 is needed. It should be applied to v5.10 and later.
> > > >
> > > > Hold on, why did this go into mainline when I had an outstanding comment w.r.t.
> > > > the stack unwinder?
> > > >
> > > > From your last reply to me there I was expecting a respin with that fixed.
> > > >
> > >
> > > Apologies for the confusion.
> > >
> > > I have a patch for this queued up, but AIUI, that cannot be merged all
> > > the way back to v5.10, so these need to remain separate changes in any
> > > case.
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=c2530a04a73e6b75ed71ed14d09d7b42d6300013
> >
> > Ah, ok, thanks for the pointer!
> >
> > I'm a little uneasy here, still.
> >
> > By backporting this we're also backporting the new breakage of the stack
> > unwinder, and the minimal change for backports would be to add the lock and not
> > the new stack (which was added for additional robustness, not to fix the bug
> > the lock fixes).
> >
> > I do appreciate that the additional stack is likely more useful than the
> > occasional diagnostic output from the kernel, but it does seem like this has
> > traded off one bug for another, and I'm just a little annoyed because I pointed
> > that out before the first pull request was made.
> >
> > I do know that this isn't malicious, and I'm not trying to start a fight, but
> > now we have to consider whether we want/need to backport a stack unwinder fix
> > to account for this, and we hadn't had that discussion before.
>
> In that case, let's drop these backports for the time being, and
> collaborate on a solution that works for all of us.

Thanks!

IIUC our options here are:

1) Create a cut-down patch for stable that just adds the new lock but leaves out
   the new stack.

   I may be missing a reason why that's insufficient or painful.

2) Backport this *but* also backport the follow-up fixes from your other series:

   https://lore.kernel.org/r/20230104174433.1259428-1-ardb@xxxxxxxxxx

   Above you mentioned something about v5.10, was that just to say that some
   manual backporting was required, or that there was a structural problem that
   would require more invasive changes / prerequisites?

3) Something else?

My preference would be (1), but if we are encountering issues with stack size
on stable kernels, then I'd be happy to help with manual backporting effort for
(2), as long as we backported all the relevant bits in one go.

Does that make sense, and does that sound reasonable to you?

Thanks,
Mark.