On Mon, Aug 21, 2023 at 11:54 AM Tom Lendacky <thomas.lendacky@xxxxxxx> wrote: > > On 8/21/23 13:15, Steve Rutherford wrote: > > On Mon, Aug 21, 2023 at 6:10 AM Tom Lendacky <thomas.lendacky@xxxxxxx> wrote: > >> > >> On 8/18/23 18:34, Steve Rutherford wrote: > >>> early_set_memory_decrypted() assumes its parameters are page aligned. > >>> Non-page aligned calls result in additional pages being marked as > >>> decrypted via the encryption status hypercall, which results in > >>> consistent corruption of pages during live migration. Live > >>> migration requires accurate encryption status information to avoid > >>> migrating pages from the wrong perspective. > >> > >> Hmmm... I'm not sure this is the proper fix. The code is actually doing > >> the right thing from a encyrption/decryption point of view by checking the > >> c-bit for the PTE associated with the virtual address and the size > >> (possibly crossing page boundaries). > >> > >> I think the problem is on the call to early_set_mem_enc_dec_hypercall() > >> where it doesn't take into account the possible crossing of page > >> boundaries and so can under-count the number of pages, right? > > > > Right now, if you request decryption of e.g. a non-page aligned 0x40 > > byte structure, it rounds the 0x40 bytes up to one page, and then > > hypercalls to mark both the page it's on and the subsequent page as > > decrypted (since the rounding stretches the structure onto the next > > page spuriously). The arithmetic in the combination of > > early_set_memory_enc_dec() and early_set_mem_enc_dec_hypercall() are > > correct if they are called with page aligned vaddrs (non-page-aligned > > sizes are fine iiuc). > > Ah, right, correct. It is still related to how the page count is > calculated for the hypercall, though, right? The encryption/decryption > operations function properly. Yep! It's just the hypercall that behaves poorly in this situation. > > If another caller of early_set_memory_decrypted() gets added, it would > need to know to do the same thing. So I just wonder if this wouldn't be > better fixed in early_set_memory_enc_dec() by using a page aligned address > and proper number of pages when calling early_set_mem_enc_dec_hypercall() > or in early_set_mem_enc_dec_hypercall() where it would take a size > argument instead of a page count and does the proper work to get a page > aligned address and proper page count. > > Also, if it is the hypercall that is causing the issue, should the Fixes > tag be 064ce6c550a0 ("mm: x86: Invoke hypercall when page encryption > status is changed") since the problem is around the hypercall. Fair question. I was torn about where to point this, since either fixing up the value inside early_set_memory_enc_dec() or fixing up the per-cpu callers is correct. The non-early version (__set_memory_enc_pgtable()) calls WARN_ONCE for misaligned addresses under the hood, so I thought the early version should have the same contract (though, obviously, this lacks the actual WARN_ONCE). I can re-upload with a WARN_ONCE or with the masking moved into early_set_memory_enc_dec(). Thanks, Steve > > Thanks, > Tom > > > > > Thanks, > > Steve > >> > >> Thanks, > >> Tom > >> > >>> > >>> Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active") > >>> Signed-off-by: Steve Rutherford <srutherford@xxxxxxxxxx> > >>> --- > >>> arch/x86/kernel/kvm.c | 14 +++++++++++++- > >>> 1 file changed, 13 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c > >>> index 6a36db4f79fd..a0c072d3103c 100644 > >>> --- a/arch/x86/kernel/kvm.c > >>> +++ b/arch/x86/kernel/kvm.c > >>> @@ -419,7 +419,14 @@ static u64 kvm_steal_clock(int cpu) > >>> > >>> static inline void __set_percpu_decrypted(void *ptr, unsigned long size) > >>> { > >>> - early_set_memory_decrypted((unsigned long) ptr, size); > >>> + /* > >>> + * early_set_memory_decrypted() requires page aligned parameters, but > >>> + * this function needs to handle ptrs offset into a page. > >>> + */ > >>> + unsigned long start = PAGE_ALIGN_DOWN((unsigned long) ptr); > >>> + unsigned long end = (unsigned long) ptr + size; > >>> + > >>> + early_set_memory_decrypted(start, end - start); > >>> } > >>> > >>> /* > >>> @@ -438,6 +445,11 @@ static void __init sev_map_percpu_data(void) > >>> return; > >>> > >>> for_each_possible_cpu(cpu) { > >>> + /* > >>> + * Calling __set_percpu_decrypted() for each per-cpu variable is > >>> + * inefficent, since it may decrypt the same page multiple times. > >>> + * That said, it avoids the need for more complicated logic. > >>> + */ > >>> __set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason)); > >>> __set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time)); > >>> __set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));