Re: [PATCH v7 1/4] KVM: Implement dirty quota-based throttling of vcpus

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 18 Nov 2022 09:47:50 +0000,
Shivam Kumar <shivam.kumar1@xxxxxxxxxxx> wrote:
> 
> 
> 
> On 18/11/22 12:56 am, Marc Zyngier wrote:
> > On Sun, 13 Nov 2022 17:05:06 +0000,
> > Shivam Kumar <shivam.kumar1@xxxxxxxxxxx> wrote:
> >> 
> >> +    count: the current count of pages dirtied by the VCPU, can be
> >> +    skewed based on the size of the pages accessed by each vCPU.
> > 
> > How can userspace make a decision on the amount of dirtying this
> > represent if this doesn't represent a number of base pages? Or are you
> > saying that this only counts the number of permission faults that have
> > dirtied pages?
> 
> Yes, this only counts the number of permission faults that have
> dirtied pages.

So how can userspace consistently set a quota of dirtied memory? This
has to account for the size that has been faulted, because that's all
userspace can reason about. Remember that at least on arm64, we're
dealing with 3 different base page sizes, and many more large page
sizes.

> 
> > 
> >> +    quota: the observed dirty quota just before the exit to
> >> userspace.
> > 
> > You are defining the quota in terms of quota. -ENOCLUE.
> 
> I am defining the "quota" member of the dirty_quota_exit struct in
> terms of "dirty quota" which is already defined in the commit
> message.

Which nobody will see. This is supposed to be a self contained
documentation.

> 
> > 
> >> +
> >> +The userspace can design a strategy to allocate the overall scope of dirtying
> >> +for the VM among the vcpus. Based on the strategy and the current state of dirty
> >> +quota throttling, the userspace can make a decision to either update (increase)
> >> +the quota or to put the VCPU to sleep for some time.
> > 
> > This looks like something out of 1984 (Newspeak anyone)? Can't you
> > just say that userspace is responsible for allocating the quota and
> > manage the resulting throttling effect?
> 
> We didn't intend to sound like the Party or the Big Brother. We
> started working on the linux and QEMU patches at the same time and got
> tempted into exposing the details of how we were using this feature in
> QEMU for throttling. I can get rid of the details if it helps.

I think the details are meaningless, and this should stick to the API,
not the way the API could be used.

> 
> >> +	/*
> >> +	 * Number of pages the vCPU is allowed to have dirtied over its entire
> >> +	 * lifetime.  KVM_RUN exits with KVM_EXIT_DIRTY_QUOTA_EXHAUSTED if the quota
> >> +	 * is reached/exceeded.
> >> +	 */
> >> +	__u64 dirty_quota;
> > 
> > How are dirty_quota and dirty_quota_exit.quota related?
> > 
> 
> dirty_quota_exit.quota is the dirty quota at the time of the exit. We
> are capturing it for userspace's reference because dirty quota can be
> updated anytime.

Shouldn't that be described here?

> 
> >> @@ -48,6 +48,7 @@ config KVM
> >>   	select KVM_VFIO
> >>   	select SRCU
> >>   	select INTERVAL_TREE
> >> +	select HAVE_KVM_DIRTY_QUOTA
> > 
> > Why isn't this part of the x86 patch?
> 
> Ack. Thanks.
> 
> >>    * Architecture-independent vcpu->requests bit members
> >> - * Bits 3-7 are reserved for more arch-independent bits.
> >> + * Bits 5-7 are reserved for more arch-independent bits.
> >>    */
> >>   #define KVM_REQ_TLB_FLUSH         (0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> >>   #define KVM_REQ_VM_DEAD           (1 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> >>   #define KVM_REQ_UNBLOCK           2
> >> +#define KVM_REQ_DIRTY_QUOTA_EXIT  4
> > 
> > Where is 3? Why reserve two bits when only one is used?
> 
> Ack. 3 was in use when I was working on the patchset. Missed this in
> my last code walkthrough before sending the patchset. Thanks.
> 
> >>   +static bool kvm_vcpu_is_dirty_quota_exhausted(struct kvm_vcpu
> >> *vcpu)
> >> +{
> >> +#ifdef CONFIG_HAVE_KVM_DIRTY_QUOTA
> >> +	u64 dirty_quota = READ_ONCE(vcpu->run->dirty_quota);
> >> +
> >> +	return dirty_quota && (vcpu->stat.generic.pages_dirtied >= dirty_quota);
> >> +#else
> >> +	return false;
> >> +#endif
> > 
> > If you introduce additional #ifdefery here, why are the additional
> > fields in the vcpu structure unconditional?
> 
> pages_dirtied can be a useful information even if dirty quota
> throttling is not used. So, I kept it unconditional based on
> feedback.

Useful for whom? This creates an ABI for all architectures, and this
needs buy-in from everyone. Personally, I think it is a pretty useless
stat.

And while we're talking about pages_dirtied, I really dislike the
WARN_ON in mark_page_dirty_in_slot(). A counter has rolled over?
Shock, horror...

> 
> CC: Sean
> 
> I can add #ifdefery in the vcpu run struct for dirty_quota.
> 
> >>   		else
> >>   			set_bit_le(rel_gfn, memslot->dirty_bitmap);
> >> +
> >> +		if (kvm_vcpu_is_dirty_quota_exhausted(vcpu))
> >> +			kvm_make_request(KVM_REQ_DIRTY_QUOTA_EXIT, vcpu);
> > 
> > This is broken in the light of new dirty-tracking code queued for
> > 6.2. Specifically, you absolutely can end-up here *without* a vcpu on
> > arm64. You just have to snapshot the ITS state to observe the fireworks.
> 
> Could you please point me to the patchset which is in queue?

The patches are in -next, and you can look at the branch here[1].
Please also refer to the discussion on the list, as a lot of what was
discussed there does apply here.

Thanks,

	M.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/dirty-ring

-- 
Without deviation from the norm, progress is not possible.



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux