On Mon, Jun 29, 2020 at 07:23:30AM +0530, Bharata B Rao wrote:
> On Sun, Jun 28, 2020 at 09:41:53PM +0530, Bharata B Rao wrote:
> > On Fri, Jun 19, 2020 at 03:43:38PM -0700, Ram Pai wrote:
> > > The time taken to switch a VM to Secure-VM increases with the size
> > > of the VM. A 100GB VM takes about 7 minutes. This is unacceptable.
> > > This linear increase is caused by suboptimal behavior of the
> > > Ultravisor and the Hypervisor. The Ultravisor unnecessarily migrates
> > > all the GFNs of the VM from normal-memory to secure-memory. It has to
> > > migrate just the necessary and sufficient GFNs.
> > >
> > > However, when the optimization is incorporated in the Ultravisor, the
> > > Hypervisor starts misbehaving. The Hypervisor has an inbuilt
> > > assumption that the Ultravisor will explicitly request to migrate
> > > each and every GFN of the VM. If only the necessary and sufficient
> > > GFNs are requested for migration, the Hypervisor continues to manage
> > > the remaining GFNs as normal GFNs. This leads to memory corruption,
> > > manifested consistently when the SVM reboots.
> > >
> > > The same is true when a memory slot is hotplugged into an SVM. The
> > > Hypervisor expects the Ultravisor to request migration of all GFNs to
> > > secure-GFNs. But at the same time, the Hypervisor is unable to handle
> > > any H_SVM_PAGE_IN requests from the Ultravisor made in the context of
> > > the UV_REGISTER_MEM_SLOT ucall. This problem manifests as random
> > > errors in the SVM when a memory slot is hotplugged.
> > >
> > > This patch series automatically migrates the non-migrated pages of an
> > > SVM, and thus solves the problem.
> >
> > So this is what I understand as the objective of this patchset:
> >
> > 1. Getting all the pages into the secure memory right when the guest
> >    transitions into secure mode is expensive. Ultravisor wants to just
> >    get the necessary and sufficient pages in and put the onus on the
> >    Hypervisor to mark the remaining pages (w/o actual page-in) as
> >    secure during H_SVM_INIT_DONE.
> > 2. During H_SVM_INIT_DONE, you want a way to differentiate the pages
> >    that are already secure from the pages that are shared and that are
> >    paged-out. For this you are introducing all these new states in HV.
> >
> > UV knows about the shared GFNs and maintains the state of the same.
> > Hence let HV send all the pages (minus already secured pages) via
> > H_SVM_PAGE_IN, and if UV finds any shared pages in them, let it fail
> > the uv-page-in call. Then HV can fail the migration for it and the page
> > continues to remain shared. With this, you don't need to maintain a
> > state for secured GFN in HV.
> >
> > In the unlikely case of sending a paged-out page to UV during
> > H_SVM_INIT_DONE, let the page-in succeed and HV will fault on it again
> > if required. With this, you don't need a state in HV to identify a
> > paged-out-but-encrypted state.
> >
> > Doesn't the above work?
>
> I see that you want to in fact skip the uv-page-in calls from
> H_SVM_INIT_DONE. So that would need the extra states in HV which you
> are proposing here.

Yes. I want to skip those calls to speed up the overall ESM switch.

RP
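
For readers following the thread, the trade-off being discussed can be sketched with a toy Python model. This is not kernel code, and every name in it (GfnState, init_done_page_in_all, init_done_mark_only) is hypothetical and chosen only for illustration. It assumes, as described above, that the first variant issues one uv-page-in per GFN that is not already secure and lets the UV reject shared GFNs, while the second variant relies on per-GFN state kept in the HV and issues no uv-page-in calls at all.

```python
from enum import Enum, auto


class GfnState(Enum):
    """Hypothetical per-GFN states; the names are illustrative only."""
    NORMAL = auto()   # not yet migrated, still managed as normal memory
    SECURE = auto()   # already paged in to secure memory
    SHARED = auto()   # explicitly shared by the SVM, must stay shared


def init_done_page_in_all(states):
    """Variant without extra HV state: send every GFN that is not already
    secure to the UV. In this model the UV is the one that knows which
    GFNs are shared, and it fails the call for those."""
    result = dict(states)
    uv_calls = 0
    for gfn, state in states.items():
        if state is GfnState.SECURE:
            continue                      # already secured, nothing to send
        uv_calls += 1                     # one uv-page-in per remaining GFN
        if state is GfnState.SHARED:
            continue                      # UV fails the call, page stays shared
        result[gfn] = GfnState.SECURE
    return result, uv_calls


def init_done_mark_only(states):
    """Variant with per-GFN state in the HV: mark the remaining normal GFNs
    as secure without issuing any uv-page-in call."""
    result = dict(states)
    for gfn, state in states.items():
        if state is GfnState.NORMAL:
            result[gfn] = GfnState.SECURE
    return result, 0


example = {
    0: GfnState.SECURE,
    1: GfnState.NORMAL,
    2: GfnState.SHARED,
    3: GfnState.NORMAL,
}
print(init_done_page_in_all(example)[1], init_done_mark_only(example)[1])
```

In this model both variants end with the same per-GFN states; they differ only in how many calls into the UV are made, and that count grows with the number of non-secure GFNs in the first variant. The sketch ignores paged-out pages and all real locking and fault handling.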