On 8/21/23 13:15, Steve Rutherford wrote:
On Mon, Aug 21, 2023 at 6:10 AM Tom Lendacky <thomas.lendacky@xxxxxxx> wrote:
On 8/18/23 18:34, Steve Rutherford wrote:
early_set_memory_decrypted() assumes its parameters are page aligned.
Non-page aligned calls result in additional pages being marked as
decrypted via the encryption status hypercall, which results in
consistent corruption of pages during live migration. Live
migration requires accurate encryption status information to avoid
migrating pages from the wrong perspective.
Hmmm... I'm not sure this is the proper fix. The code is actually doing
the right thing from a encyrption/decryption point of view by checking the
c-bit for the PTE associated with the virtual address and the size
(possibly crossing page boundaries).
I think the problem is on the call to early_set_mem_enc_dec_hypercall()
where it doesn't take into account the possible crossing of page
boundaries and so can under-count the number of pages, right?
Right now, if you request decryption of e.g. a non-page aligned 0x40
byte structure, it rounds the 0x40 bytes up to one page, and then
hypercalls to mark both the page it's on and the subsequent page as
decrypted (since the rounding stretches the structure onto the next
page spuriously). The arithmetic in the combination of
early_set_memory_enc_dec() and early_set_mem_enc_dec_hypercall() are
correct if they are called with page aligned vaddrs (non-page-aligned
sizes are fine iiuc).
Ah, right, correct. It is still related to how the page count is
calculated for the hypercall, though, right? The encryption/decryption
operations function properly.
If another caller of early_set_memory_decrypted() gets added, it would
need to know to do the same thing. So I just wonder if this wouldn't be
better fixed in early_set_memory_enc_dec() by using a page aligned address
and proper number of pages when calling early_set_mem_enc_dec_hypercall()
or in early_set_mem_enc_dec_hypercall() where it would take a size
argument instead of a page count and does the proper work to get a page
aligned address and proper page count.
Also, if it is the hypercall that is causing the issue, should the Fixes
tag be 064ce6c550a0 ("mm: x86: Invoke hypercall when page encryption
status is changed") since the problem is around the hypercall.
Thanks,
Tom
Thanks,
Steve
Thanks,
Tom
Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active")
Signed-off-by: Steve Rutherford <srutherford@xxxxxxxxxx>
---
arch/x86/kernel/kvm.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 6a36db4f79fd..a0c072d3103c 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -419,7 +419,14 @@ static u64 kvm_steal_clock(int cpu)
static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
{
- early_set_memory_decrypted((unsigned long) ptr, size);
+ /*
+ * early_set_memory_decrypted() requires page aligned parameters, but
+ * this function needs to handle ptrs offset into a page.
+ */
+ unsigned long start = PAGE_ALIGN_DOWN((unsigned long) ptr);
+ unsigned long end = (unsigned long) ptr + size;
+
+ early_set_memory_decrypted(start, end - start);
}
/*
@@ -438,6 +445,11 @@ static void __init sev_map_percpu_data(void)
return;
for_each_possible_cpu(cpu) {
+ /*
+ * Calling __set_percpu_decrypted() for each per-cpu variable is
+ * inefficent, since it may decrypt the same page multiple times.
+ * That said, it avoids the need for more complicated logic.
+ */
__set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
__set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
__set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));