On Thu, Apr 9, 2020 at 6:23 PM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
>
> Hello Steve,
>
> On Thu, Apr 09, 2020 at 05:06:21PM -0700, Steve Rutherford wrote:
> > On Tue, Apr 7, 2020 at 6:49 PM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
> > >
> > > Hello Steve,
> > >
> > > On Tue, Apr 07, 2020 at 05:26:33PM -0700, Steve Rutherford wrote:
> > > > On Sun, Mar 29, 2020 at 11:23 PM Ashish Kalra <Ashish.Kalra@xxxxxxx> wrote:
> > > > >
> > > > > From: Brijesh Singh <Brijesh.Singh@xxxxxxx>
> > > > >
> > > > > The ioctl can be used to set page encryption bitmap for an
> > > > > incoming guest.
> > > > >
> > > > > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > > > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > > > > Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> > > > > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > > > > Cc: "Radim Krčmář" <rkrcmar@xxxxxxxxxx>
> > > > > Cc: Joerg Roedel <joro@xxxxxxxxxx>
> > > > > Cc: Borislav Petkov <bp@xxxxxxx>
> > > > > Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
> > > > > Cc: x86@xxxxxxxxxx
> > > > > Cc: kvm@xxxxxxxxxxxxxxx
> > > > > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > > > > Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx>
> > > > > Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
> > > > > ---
> > > > >  Documentation/virt/kvm/api.rst  | 22 +++++++++++++++++
> > > > >  arch/x86/include/asm/kvm_host.h |  2 ++
> > > > >  arch/x86/kvm/svm.c              | 42 +++++++++++++++++++++++++++++++++
> > > > >  arch/x86/kvm/x86.c              | 12 ++++++++++
> > > > >  include/uapi/linux/kvm.h        |  1 +
> > > > >  5 files changed, 79 insertions(+)
> > > > >
> > > > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > > > > index 8ad800ebb54f..4d1004a154f6 100644
> > > > > --- a/Documentation/virt/kvm/api.rst
> > > > > +++ b/Documentation/virt/kvm/api.rst
> > > > > @@ -4675,6 +4675,28 @@ or shared. The bitmap can be used during the guest migration, if the page
> > > > >  is private then userspace need to use SEV migration commands to transmit
> > > > >  the page.
> > > > >
> > > > > +4.126 KVM_SET_PAGE_ENC_BITMAP (vm ioctl)
> > > > > +---------------------------------------
> > > > > +
> > > > > +:Capability: basic
> > > > > +:Architectures: x86
> > > > > +:Type: vm ioctl
> > > > > +:Parameters: struct kvm_page_enc_bitmap (in/out)
> > > > > +:Returns: 0 on success, -1 on error
> > > > > +
> > > > > +/* for KVM_SET_PAGE_ENC_BITMAP */
> > > > > +struct kvm_page_enc_bitmap {
> > > > > +	__u64 start_gfn;
> > > > > +	__u64 num_pages;
> > > > > +	union {
> > > > > +		void __user *enc_bitmap; /* one bit per page */
> > > > > +		__u64 padding2;
> > > > > +	};
> > > > > +};
> > > > > +
> > > > > +During the guest live migration the outgoing guest exports its page encryption
> > > > > +bitmap, the KVM_SET_PAGE_ENC_BITMAP can be used to build the page encryption
> > > > > +bitmap for an incoming guest.
> > > > >
> > > > >  5. The kvm_run structure
> > > > >  ========================
> > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > > index 27e43e3ec9d8..d30f770aaaea 100644
> > > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > > @@ -1271,6 +1271,8 @@ struct kvm_x86_ops {
> > > > >  				  unsigned long sz, unsigned long mode);
> > > > >  	int (*get_page_enc_bitmap)(struct kvm *kvm,
> > > > >  				struct kvm_page_enc_bitmap *bmap);
> > > > > +	int (*set_page_enc_bitmap)(struct kvm *kvm,
> > > > > +				struct kvm_page_enc_bitmap *bmap);
> > > > >  };
> > > > >
> > > > >  struct kvm_arch_async_pf {
> > > > > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > > > > index bae783cd396a..313343a43045 100644
> > > > > --- a/arch/x86/kvm/svm.c
> > > > > +++ b/arch/x86/kvm/svm.c
> > > > > @@ -7756,6 +7756,47 @@ static int svm_get_page_enc_bitmap(struct kvm *kvm,
> > > > >  	return ret;
> > > > >  }
> > > > >
> > > > > +static int svm_set_page_enc_bitmap(struct kvm *kvm,
> > > > > +				   struct kvm_page_enc_bitmap *bmap)
> > > > > +{
> > > > > +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> > > > > +	unsigned long gfn_start, gfn_end;
> > > > > +	unsigned long *bitmap;
> > > > > +	unsigned long sz, i;
> > > > > +	int ret;
> > > > > +
> > > > > +	if (!sev_guest(kvm))
> > > > > +		return -ENOTTY;
> > > > > +
> > > > > +	gfn_start = bmap->start_gfn;
> > > > > +	gfn_end = gfn_start + bmap->num_pages;
> > > > > +
> > > > > +	sz = ALIGN(bmap->num_pages, BITS_PER_LONG) / 8;
> > > > > +	bitmap = kmalloc(sz, GFP_KERNEL);
> > > > > +	if (!bitmap)
> > > > > +		return -ENOMEM;
> > > > > +
> > > > > +	ret = -EFAULT;
> > > > > +	if (copy_from_user(bitmap, bmap->enc_bitmap, sz))
> > > > > +		goto out;
> > > > > +
> > > > > +	mutex_lock(&kvm->lock);
> > > > > +	ret = sev_resize_page_enc_bitmap(kvm, gfn_end);
> > > > I realize now that usermode could use this for initializing the
> > > > minimum size of the enc bitmap, which probably solves my issue from
> > > > the other thread.
> > > > > +	if (ret)
> > > > > +		goto unlock;
> > > > > +
> > > > > +	i = gfn_start;
> > > > > +	for_each_clear_bit_from(i, bitmap, (gfn_end - gfn_start))
> > > > > +		clear_bit(i + gfn_start, sev->page_enc_bmap);
> > > > This API seems a bit strange, since it can only clear bits. I would
> > > > expect "set" to force the values to match the values passed down,
> > > > instead of only ensuring that cleared bits in the input are also
> > > > cleared in the kernel.
> > > >
> > >
> > > The sev_resize_page_enc_bitmap() will allocate a new bitmap and
> > > set it to all 0xFF's, therefore, the code here simply clears the bits
> > > in the bitmap as per the cleared bits in the input.
> >
> > If I'm not mistaken, resize only reinitializes the newly extended part
> > of the buffer, and copies the old values for the rest.
> > With the API you proposed you could probably reimplement a normal set
> > call by calling get, then reset, and then set, but this feels
> > cumbersome.
> >
>
> As i mentioned earlier, the set api is basically meant for the incoming
> VM, the resize will initialize the incoming VM's bitmap to all 0xFF's
> and as there won't be any bitmap allocated initially on the incoming VM,
> therefore, the bitmap copy will not do anything and the clear_bit later
> will clear the incoming VM's bits as per the input.

The documentation does not make that clear. A typical set call in the
KVM API lets you go to any state, not just a subset of states. Yes,
this works in the common case of migrating a VM to a particular target,
once. But I find the behavior of the current API surprising, and I
prefer APIs that are unsurprising. If I had not read the code, I could
easily have assumed it worked like a normal set call. You could rename
the ioctl to something like "CLEAR_BITS", but a set-based API is more
common.

Thanks,
Steve
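
P.S. To make the distinction concrete, here is a rough userspace sketch
of the two semantics. It is not kernel code and has not been tested
against the patch. The helper names (bm_test, bm_set, bm_clear,
apply_clear_only, apply_set) and the uint64_t word layout are my own
assumptions for illustration. apply_clear_only models the behavior of
the posted ioctl as I read it, and apply_set is what I would expect a
call named "set" to do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a one-bit-per-page bitmap. */
#define BM_WORD_BITS 64

static int bm_test(const uint64_t *map, size_t n)
{
	return (int)((map[n / BM_WORD_BITS] >> (n % BM_WORD_BITS)) & 1);
}

static void bm_set(uint64_t *map, size_t n)
{
	map[n / BM_WORD_BITS] |= (uint64_t)1 << (n % BM_WORD_BITS);
}

static void bm_clear(uint64_t *map, size_t n)
{
	map[n / BM_WORD_BITS] &= ~((uint64_t)1 << (n % BM_WORD_BITS));
}

/*
 * Clear-only semantics: a cleared input bit clears the destination bit,
 * a set input bit leaves the destination untouched.
 */
static void apply_clear_only(uint64_t *dst, size_t start,
			     const uint64_t *src, size_t npages)
{
	for (size_t i = 0; i < npages; i++)
		if (!bm_test(src, i))
			bm_clear(dst, start + i);
}

/*
 * Set semantics: every destination bit in the range is forced to match
 * the corresponding input bit.
 */
static void apply_set(uint64_t *dst, size_t start,
		      const uint64_t *src, size_t npages)
{
	for (size_t i = 0; i < npages; i++) {
		if (bm_test(src, i))
			bm_set(dst, start + i);
		else
			bm_clear(dst, start + i);
	}
}
```

The two only agree when the destination range starts out as all ones,
which is the fresh incoming VM case you describe. On any bitmap that
already has cleared bits, clear-only cannot turn them back on.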