Re: [PATCH v10 1/3] KVM: Implement dirty quota-based throttling of vcpus

Anish Moorthy <amoorthy@xxxxxxxxxx> · Wed, 21 Feb 2024 18:00:16 -0800

I just saw this on the mailing list and had a couple minor thoughts,
apologies if I'm contradicting any of the feedback you've received on
previous versions

On Wed, Feb 21, 2024 at 12:01 PM Shivam Kumar <shivam.kumar1@xxxxxxxxxxx> wrote:
>
> Define dirty_quota_bytes variable to track and throttle memory
> dirtying for every vcpu. This variable stores the number of bytes the
> vcpu is allowed to dirty. To dirty more, the vcpu needs to request
> more quota by exiting to userspace.
>
> Implement update_dirty_quota function which

Tiny nit, but can we just rename this to "reduce_dirty_quota"? It's
easy to see what an "update" is, but might as well make it even
clearer.

> +#ifdef CONFIG_HAVE_KVM_DIRTY_QUOTA
> +void update_dirty_quota(struct kvm *kvm, unsigned long page_size_bytes);
> +#else
> +static inline void update_dirty_quota(struct kvm *kvm, unsigned long page_size_bytes)
> +{
> +}
> +#endif

Is there a reason to #ifdef like this instead of just having a single
definition and doing

> void update_dirty_quota(,,,) {
>     if (!IS_ENABLED(CONFIG_HAVE_KVM_DIRTY_QUOTA)) return;
>     // actual body here
> }

in the body? I figure the compiler elides the no-op call, though I've
never bothered to check...

> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 10bfc88a69f7..9a1e67187735 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3626,6 +3626,19 @@ int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len)
>  }
>  EXPORT_SYMBOL_GPL(kvm_clear_guest);
>
> +void update_dirty_quota(struct kvm *kvm, unsigned long page_size_bytes)
> +{
> +       struct kvm_vcpu *vcpu = kvm_get_running_vcpu();

Can we just make update_dirty_quota() take a kvm_vcpu* instead of a
kvm* as its first parameter? Since the quota is per-vcpu, that seems
to make sense, and most of the callers of this function look like

> update_dirty_quota(vcpu->kvm, some_size_here);

anyways. The only one that's not is the addition in mark_page_dirty()

>  void mark_page_dirty_in_slot(struct kvm *kvm,
>                              const struct kvm_memory_slot *memslot,
>                              gfn_t gfn)
> @@ -3656,6 +3669,7 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
>         struct kvm_memory_slot *memslot;
>
>         memslot = gfn_to_memslot(kvm, gfn);
> +       update_dirty_quota(kvm, PAGE_SIZE);
>         mark_page_dirty_in_slot(kvm, memslot, gfn);
>  }

Is mark_page_dirty() allowed to be used outside of a vCPU context? The
lack of a vcpu* makes me think it is- I assume we don't want to charge
vCPUs for accesses they're not making.

Unfortunately we do seem to use it *in* vCPU contexts (see
kvm_update_stolen_time() on arm64?), although not on x86 AFAICT.