On Fri, May 14, 2021 at 11:34:32AM +0200, Borislav Petkov wrote: > On Fri, May 14, 2021 at 09:05:23AM +0000, Ashish Kalra wrote: > > Ideally we should fail/stop migration even if a single guest page > > encryption status cannot be notified and that should be the way to > > proceed in this case, the guest kernel should notify the source > > userspace VMM to block/stop migration in this case. > > Yap, and what I'm trying to point out here is that if the low level > machinery fails for whatever reason and it cannot recover, we should > propagate that error up the chain so that the user is aware as to why it > failed. > I totally agree. > WARN is a good first start but in some configurations those splats don't > even get shown as people don't look at dmesg, etc. > > And I think it is very important to propagate those errors properly > because there's a *lot* of moving parts involved in a guest migration > and you have encrypted memory which makes debugging this probably even > harder, etc, etc. > > I hope this makes more sense. > > > From a practical side, i do see Qemu's migrate_add_blocker() interface > > but that looks to be a static interface and also i don't think it will > > force stop an ongoing migration, is there an existing mechanism > > to inform userspace VMM from kernel about blocking/stopping migration ? > > Hmm, so __set_memory_enc_dec() which calls > notify_addr_enc_status_changed() is called by the guest, right, when it > starts migrating. > No, actually notify_addr_enc_status_changed() is called whenever a range of memory is marked as encrypted or decrypted, so it has nothing to do with migration as such. This is basically modifying the encryption attributes on the page tables and correspondingly also making the hypercall to inform the hypervisor about page status encryption changes. The hypervisor will use this information during an ongoing or future migration, so this information is maintained even though migration might never be initiated here. > Can an error value from it be propagated up the callchain so it can be > turned into an error messsage for the guest owner to see? > The error value cannot be propogated up the callchain directly here, but one possibility is to leverage the hypercall and use Sean's proposed hypercall interface to notify the host/hypervisor to block/stop any future/ongoing migration. Or as from Paolo's response, writing 0 to MIGRATION_CONTROL MSR seems more ideal. Thanks, Ashish