Re: [PATCH v1 02/13] KVM: s390: fake memslots for ucontrol VMs

On Fri, 10 Jan 2025 08:22:12 -0800
Sean Christopherson <seanjc@xxxxxxxxxx> wrote:

> On Fri, Jan 10, 2025, Claudio Imbrenda wrote:
> > On Fri, 10 Jan 2025 10:31:38 +0100
> > Christian Borntraeger <borntraeger@xxxxxxxxxx> wrote:
> >   
> > > On 08.01.25 at 19:14, Claudio Imbrenda wrote:  
> > > > +static void kvm_s390_ucontrol_ensure_memslot(struct kvm *kvm, unsigned long addr)
> > > > +{
> > > > +	struct kvm_userspace_memory_region2 region = {
> > > > +		.slot = addr / UCONTROL_SLOT_SIZE,
> > > > +		.memory_size = UCONTROL_SLOT_SIZE,
> > > > +		.guest_phys_addr = ALIGN_DOWN(addr, UCONTROL_SLOT_SIZE),
> > > > +		.userspace_addr = ALIGN_DOWN(addr, UCONTROL_SLOT_SIZE),
> > > > +	};
> > > > +	struct kvm_memory_slot *slot;
> > > > +
> > > > +	mutex_lock(&kvm->slots_lock);
> > > > +	slot = gfn_to_memslot(kvm, addr);
> > > > +	if (!slot)
> > > > +		__kvm_set_memory_region(kvm, &region);  
> 
> The return value definitely should be checked, especially if the memory regions
> are not KVM-internal, i.e. if userspace is allowed to create memslots.
> 

will fix, unless we do what you propose below

> > > > +	mutex_unlock(&kvm->slots_lock);
> > > > +}
> > > > +    
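
For readers following along, here is a minimal userspace sketch of how the
quoted helper derives a fake slot from a guest address: the slot id is the
address divided by the slot size, and both the guest physical address and
the userspace address are the address rounded down to a slot boundary.
DEMO_SLOT_SIZE, struct demo_region and the demo_* helpers are hypothetical
stand-ins, not kernel definitions; the real UCONTROL_SLOT_SIZE is not shown
in this mail, so the value below is only an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed placeholder; the real UCONTROL_SLOT_SIZE is not shown here. */
#define DEMO_SLOT_SIZE (1ULL << 32)

/* Hypothetical stand-in for the fields the quoted helper fills in. */
struct demo_region {
	uint32_t slot;
	uint64_t guest_phys_addr;
	uint64_t userspace_addr;
	uint64_t memory_size;
};

/* Round addr down to a multiple of size; size must be a power of two. */
static uint64_t demo_align_down(uint64_t addr, uint64_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return addr & ~(size - 1);
}

/* Build the region description for the fake slot covering addr. */
static struct demo_region demo_region_for(uint64_t addr)
{
	struct demo_region region = {
		.slot = (uint32_t)(addr / DEMO_SLOT_SIZE),
		.memory_size = DEMO_SLOT_SIZE,
		.guest_phys_addr = demo_align_down(addr, DEMO_SLOT_SIZE),
		.userspace_addr = demo_align_down(addr, DEMO_SLOT_SIZE),
	};

	return region;
}
```

Every address inside the same slot-sized window yields the same slot id and
the same base, which is why the helper only needs to create a slot when the
lookup for that address comes back empty.
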
> > > 
> > > Would simply having one slot from 0 to TASK_SIZE also work? This could avoid the
> > > construction of the fake slots during runtime.  
> > 
> > unfortunately memslots are limited to 4TiB.
> > having bigger ones would require even more changes all across KVM (and
> > maybe qemu too)  
> 
> AFAIK, that limitation exists purely because of dirty bitmaps.  IIUC, these "fake"
> memslots are not intended to be visible to userspace, or at the very least don't
> *need* to be visible to userspace.
> 
> Assuming that's true, they/it can/should be KVM-internal memslots, and those
> should never be dirty-logged.  x86 allocates metadata based on slot size, so in
> practice creating a mega-slot will never succeed on x86, but the only size
> limitation I see in s390 is on arch.mem_limit, but for ucontrol that's set to -1ull,
> i.e. is a non-issue.
> 
> I have a series (that I need to refresh) to provide a dedicated API for creating
> internal memslots, and to also enforce that flags == 0 for internal memslots,
> i.e. to enforce that dirty logging is never enabled (see Link below).  With that
> in mind, I can't think of any reason to disallow a 0 => TASK_SIZE memslot so long
> as it's KVM-defined.
> 
> Using a single memslot would hopefully allow s390 to unconditionally carve out a
> KVM-internal memslot, i.e. not have to condition the logic on the type of VM.  E.g.

yes, I would love that

the reason why I did not use internal memslots is that I would have
potentially needed *all* the memslots for ucontrol, and instead of
reserving, say, half of all memslots, I decided to have them
user-visible, which is a hack I honestly don't like.

do you think you can refresh the series before the upcoming merge
window?

otherwise I should split this series in two, since page->index needs to
be removed asap.

> 
>   #define KVM_INTERNAL_MEM_SLOTS 1
> 
>   #define KVM_S390_UCONTROL_MEMSLOT (KVM_USER_MEM_SLOTS + 0)
> 
> And then I think just this?
> 
> ---
> From: Sean Christopherson <seanjc@xxxxxxxxxx>
> Date: Fri, 10 Jan 2025 08:05:09 -0800
> Subject: [PATCH] KVM: Do not restrict the size of KVM-internal memory regions
> 
> Exempt KVM-internal memslots from the KVM_MEM_MAX_NR_PAGES restriction, as
> the limit on the number of pages exists purely to play nice with dirty
> bitmap operations, which use 32-bit values to index the bitmaps, and dirty
> logging isn't supported for KVM-internal memslots.
> 
> Link: https://lore.kernel.org/all/20240802205003.353672-6-seanjc@xxxxxxxxxx
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
>  virt/kvm/kvm_main.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 8a0d0d37fb17..3cea406c34db 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1972,7 +1972,15 @@ int __kvm_set_memory_region(struct kvm *kvm,
>  		return -EINVAL;
>  	if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)
>  		return -EINVAL;
> -	if ((mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)
> +
> +	/*
> +	 * The size of userspace-defined memory regions is restricted in order
> +	 * to play nice with dirty bitmap operations, which are indexed with an
> +	 * "unsigned int".  KVM's internal memory regions don't support dirty
> +	 * logging, and so are exempt.
> +	 */
> +	if (id < KVM_USER_MEM_SLOTS &&
> +	    (mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)
>  		return -EINVAL;
>  
>  	slots = __kvm_memslots(kvm, as_id);
> 
> base-commit: 1aadfba8419606d447d1961f25e2d312011ad45a
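
To make the effect of the quoted check concrete, here is a small userspace
model of it: slot ids below the user-slot count are subject to the page-count
limit, while ids at or above it (the KVM-internal ones) are exempt. All
DEMO_* constants and demo_check_region_size() are hypothetical names chosen
for illustration; the values are assumptions and may not match any given
architecture or kernel version.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed placeholder values, not taken from the kernel headers. */
#define DEMO_PAGE_SHIFT		12
#define DEMO_USER_MEM_SLOTS	32
#define DEMO_MEM_MAX_NR_PAGES	((1ULL << 31) - 1)

/*
 * Model of the size check: only userspace-defined slots
 * (id < DEMO_USER_MEM_SLOTS) are limited in their number of pages.
 */
static int demo_check_region_size(unsigned int id, uint64_t memory_size)
{
	if (id < DEMO_USER_MEM_SLOTS &&
	    (memory_size >> DEMO_PAGE_SHIFT) > DEMO_MEM_MAX_NR_PAGES)
		return -EINVAL;

	return 0;
}

/* Usage example: the same oversized region is rejected or accepted by id. */
void demo_usage(void)
{
	uint64_t huge = 1ULL << 43;	/* 2^31 pages of 4 KiB, one over the limit */

	assert(demo_check_region_size(0, huge) == -EINVAL);
	assert(demo_check_region_size(DEMO_USER_MEM_SLOTS, huge) == 0);
}
```

With a check of this shape, a single slot with id KVM_USER_MEM_SLOTS + 0
could cover the whole range regardless of its size, as long as dirty logging
stays disabled for it.
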




