On 02.08.21 12:19, Joerg Roedel wrote:
> On Tue, Jul 27, 2021 at 11:34:47AM +0200, David Hildenbrand wrote:
>> What makes you think that? I already heard people express desires for memory
>> hot(un)plug, especially in the context of running containers inside
>> encrypted VMs. And static bitmaps are naturally a bad choice for changing
>> memory layouts.
>
> In the worst case some memory in the bitmap is wasted when memory is
> hot-unplugged. The amount depends on how much memory one bit covers, but
> I don't see this as a show stopper.
Devil's in the details when you want to hotplug later; for example,
before parsing SRAT, we have no clue how much memory we might have at
one point at runtime later. And you'd have to prepare for that by
allocating the bitmap accordingly. And as I said, it's not a sparse data
structure, so you will at least waste some memory.
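To put rough numbers on that, here is a back-of-the-envelope sketch. It is
not code from the series; the helper name and the one-bit-per-4-KiB
granularity are assumptions made purely for illustration:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes of a flat (non-sparse) bitmap that
 * tracks the physical range [0, max_phys_addr) with one bit per
 * bytes_per_bit bytes of memory.
 */
static uint64_t validation_bitmap_bytes(uint64_t max_phys_addr,
					uint64_t bytes_per_bit)
{
	uint64_t bits;

	assert(bytes_per_bit != 0);

	/* Round up so a partially covered chunk still gets a bit. */
	bits = (max_phys_addr + bytes_per_bit - 1) / bytes_per_bit;

	/* Round up to whole bytes. */
	return (bits + 7) / 8;
}
```

Under those assumptions, a guest booted with 4 GiB needs 128 KiB of bitmap,
while sizing the bitmap up front for a 1 TiB hotplug ceiling needs 32 MiB,
whether or not that memory ever shows up.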
>> I'm wondering, why exactly would a kdump kernel (not touching memory of the
>> old kernel while booting up) need access to the bitmap? Just wondering, for
>> ACPI tables and such? I can understand why makedumpfile would need that
>> information when actually dumping memory of the old kernel, but it would
>> have access to the memmap of the old kernel to obtain that information.
>
> The kdump kernel needs the bitmap to detect when the Hypervisor is doing
> something malicious, well, at least on its own memory. The kdump kernel
> has full access to the previous kernel's memory and could also be tricked
> by the Hypervisor to reveal secrets.
That's an interesting thought. But this raises many questions, how and
what to dump in context of encrypted VMs at all. I'd love to see some
writeup of what we actually want to dump, with which tools, and to which
(encrypted?) locations.
The kdump kernel has access to the memmap of the old kernel. The memmap
of the old kernel would contain information regarding encrypted pages.
The kdump kernel and the tools (makedumpfile) running in the VM cannot
be tampered with by the hypervisor. The memmap of the old kernel cannot
be tampered with, as it resides on encrypted memory. Are my assumptions
correct?
I'd be interested how a hypervisor could trigger revealing secrets.
>> Mirroring is a good point. But I'd suggest using the bitmap only during
>> early boot if really necessary and after syncing it to the bitmap, get rid
>> of it. Sure, kexec is more challenging, but at least it's a clean design. We
>> can always try expressing the state of validated memory in the e820 map we
>> present to the kexec kernel.
>
> It depends on how fragmented the validated/unvalidated regions will get
> over time. I think currently it is not very fragmented, the biggest
> shared regions are the .bss_decrypted section and the DMA bounce buffer.
> But there are also a couple of page-size regions which need to be
> shared. For kexec these regions can be validated again when tearing down
> the APs, but for kdump it would be too fragile to do such extensive
> stuff before jumping to the kdump kernel.
Right, I don't really see a blocker for kexec, just needs some proper
creation/update of the e820 map. For kdump, I am not sure if we really
need it, but most probably if we would have a complete picture of kdump
for encrypted VMs it would get much clearer what we actually have to
care about.
--
Thanks,
David / dhildenb