Re: [PATCH v7 1/4] KVM: Implement dirty quota-based throttling of vcpus

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 18/11/22 12:56 am, Marc Zyngier wrote:
On Sun, 13 Nov 2022 17:05:06 +0000,
Shivam Kumar <shivam.kumar1@xxxxxxxxxxx> wrote:

+    count: the current count of pages dirtied by the VCPU, can be
+    skewed based on the size of the pages accessed by each vCPU.

How can userspace make a decision on the amount of dirtying this
represent if this doesn't represent a number of base pages? Or are you
saying that this only counts the number of permission faults that have
dirtied pages?

Yes, this only counts the number of permission faults that have dirtied pages.


+    quota: the observed dirty quota just before the exit to
userspace.

You are defining the quota in terms of quota. -ENOCLUE.

I am defining the "quota" member of the dirty_quota_exit struct in terms of "dirty quota" which is already defined in the commit message.


+
+The userspace can design a strategy to allocate the overall scope of dirtying
+for the VM among the vcpus. Based on the strategy and the current state of dirty
+quota throttling, the userspace can make a decision to either update (increase)
+the quota or to put the VCPU to sleep for some time.

This looks like something out of 1984 (Newspeak anyone)? Can't you
just say that userspace is responsible for allocating the quota and
manage the resulting throttling effect?

We didn't intend to sound like the Party or the Big Brother. We started working on the linux and QEMU patches at the same time and got tempted into exposing the details of how we were using this feature in QEMU for throttling. I can get rid of the details if it helps.

+	/*
+	 * Number of pages the vCPU is allowed to have dirtied over its entire
+	 * lifetime.  KVM_RUN exits with KVM_EXIT_DIRTY_QUOTA_EXHAUSTED if the quota
+	 * is reached/exceeded.
+	 */
+	__u64 dirty_quota;

How are dirty_quota and dirty_quota_exit.quota related?


dirty_quota_exit.quota is the dirty quota at the time of the exit. We are capturing it for userspace's reference because dirty quota can be updated anytime.

@@ -48,6 +48,7 @@ config KVM
  	select KVM_VFIO
  	select SRCU
  	select INTERVAL_TREE
+	select HAVE_KVM_DIRTY_QUOTA

Why isn't this part of the x86 patch?

Ack. Thanks.

   * Architecture-independent vcpu->requests bit members
- * Bits 3-7 are reserved for more arch-independent bits.
+ * Bits 5-7 are reserved for more arch-independent bits.
   */
  #define KVM_REQ_TLB_FLUSH         (0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
  #define KVM_REQ_VM_DEAD           (1 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
  #define KVM_REQ_UNBLOCK           2
+#define KVM_REQ_DIRTY_QUOTA_EXIT  4

Where is 3? Why reserve two bits when only one is used?

Ack. 3 was in use when I was working on the patchset. Missed this in my last code walkthrough before sending the patchset. Thanks.

+static bool kvm_vcpu_is_dirty_quota_exhausted(struct kvm_vcpu *vcpu)
+{
+#ifdef CONFIG_HAVE_KVM_DIRTY_QUOTA
+	u64 dirty_quota = READ_ONCE(vcpu->run->dirty_quota);
+
+	return dirty_quota && (vcpu->stat.generic.pages_dirtied >= dirty_quota);
+#else
+	return false;
+#endif

If you introduce additional #ifdefery here, why are the additional
fields in the vcpu structure unconditional?

pages_dirtied can be a useful information even if dirty quota throttling is not used. So, I kept it unconditional based on feedback.

CC: Sean

I can add #ifdefery in the vcpu run struct for dirty_quota.

  		else
  			set_bit_le(rel_gfn, memslot->dirty_bitmap);
+
+		if (kvm_vcpu_is_dirty_quota_exhausted(vcpu))
+			kvm_make_request(KVM_REQ_DIRTY_QUOTA_EXIT, vcpu);

This is broken in the light of new dirty-tracking code queued for
6.2. Specifically, you absolutely can end-up here *without* a vcpu on
arm64. You just have to snapshot the ITS state to observe the fireworks.

Could you please point me to the patchset which is in queue?


I am grateful for the suggestions and feedback.

Thanks,
Shivam



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux