Re: [PATCH v2] x86/kexec: Add EFI config table identity mapping for kexec kernel

Tao Liu <ltao@xxxxxxxxxx> · Mon, 17 Jul 2023 23:02:43 +0800

Hi Ard,

Thanks for your explanation!

On Thu, Jul 13, 2023 at 6:18 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Fri, 7 Jul 2023 at 19:12, Borislav Petkov <bp@xxxxxxxxx> wrote:
> >
> > On Fri, Jul 07, 2023 at 10:25:15AM -0500, Michael Roth wrote:
> > > ...
> > > It would be unfortunate if we finally abandoned this path because of the
> > > issue being hit here though. I think the patch posted here is the proper
> > > resolution to the issue being hit, and I'm hoping at this point we've
> > > identified all the similar cases where EFI/setup_data-related structures
> > > were missing explicit mappings. But if we still think it's too much of a
> > > liability to access the EFI config table outside of SEV-enabled guests,
> > > then I can work on re-implementing things based on the above logic.
> >
> > Replying here to Tom's note too...
> >
> > So, I like the idea of rechecking CPUID. Yes, let's do the sev_status
> > check. As a result, we either fail the guest - no problem - or we boot
> > and we recheck. Thus, we don't run AMD code on !AMD machines, if the HV
> > is not a lying bastard.
> >
> > Now, if we've gotten a valid setup_data SETUP_EFI entry with a valid
> > pointer to an EFI config table, then that should happen in the generic
> > path - initialize_identity_maps(), for example - like you've done in
> > b57feed2cc26 - not in the kexec code because kexec *happens* to need it.
> >
> > We want to access the EFI config table? Sure, by all means, but make
> > that generic for all code.
> >
>
> OK, so in summary, what seems to be happening here is that the SEV
> init code in the decompressor looks for the cc blob table before the
> on-demand mapping code is up, which normally ensures that any RAM
> address is accessible even if it hasn't been mapped explicitly.
>
Yes it is exactly the case.

> This is why the fix happens to work: the code only maps the array of
> (guid, phys_addr) tuples that describes the list of configuration
> tables that have been provided by the firmware. The actual
> configuration tables themselves could be anywhere in physical memory,
> and without prior knowledge of a particular GUID value, there is no
> way to know the size of the table, and so they cannot be mapped

Should we loop map each element of the config table one at a time? We
read a GUID value, then we map it with phys_addr, phys_addr +
sizeof(struct), then we read-map the next one, in this way we may not
need to know the size of the table?

> upfront like this. However, the cc blob table does not exist on this
> machine, and so whether the EFI config tables themselves are mapped or
> not is irrelevant.

Currently we don't need all the data of the config table, only part of
it. So only  (guid, phys_addr) tuple arrays are mapped into. Won't it
be better if we map config table on demand if we need further data
later?

>
> But it does mean the fix is incomplete, and certainly does not belong
> in generic kexec code. If anything, we should be fixing the
> decompressor code to defer the cc blob table check until after the
> demand mapping code is up.

Yes, if we can defer the cc blob access, the issue can be resolved. I
don't know if it is doable from AMD's view...
>
> If this is problematic, we might instead disable SEV for kexec, and
> rely on the fact that SEV firmware enters with a complete 1:1 map (as

We still need sev support for kexec. If we disable sev during kexec, I
suppose kdump may be broken on a sev guest, maybe?

Thanks,
Tao Liu

> we seem to be doing currently). If kexec for SEV is needed at some
> point, we can re-enable it by having it provide a mapping for the
> config table array and the cc blob table explicitly.
>