On Sun, Nov 13, 2022 at 05:05:06PM +0000, Shivam Kumar wrote:
> Define variables to track and throttle memory dirtying for every vcpu.
>
> dirty_count:    Number of pages the vcpu has dirtied since its creation,
>                 while dirty logging is enabled.
> dirty_quota:    Number of pages the vcpu is allowed to dirty. To dirty
>                 more, it needs to request more quota by exiting to
>                 userspace.
>
> Implement the flow for throttling based on dirty quota.
>
> i) Increment dirty_count for the vcpu whenever it dirties a page.
> ii) Exit to userspace whenever the dirty quota is exhausted (i.e. dirty
> count equals/exceeds dirty quota) to request more dirty quota.
>
> Suggested-by: Shaju Abraham <shaju.abraham@xxxxxxxxxxx>
> Suggested-by: Manish Mishra <manish.mishra@xxxxxxxxxxx>
> Co-developed-by: Anurag Madnawat <anurag.madnawat@xxxxxxxxxxx>
> Signed-off-by: Anurag Madnawat <anurag.madnawat@xxxxxxxxxxx>
> Signed-off-by: Shivam Kumar <shivam.kumar1@xxxxxxxxxxx>
> ---
>  Documentation/virt/kvm/api.rst | 35 ++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/Kconfig           |  1 +
>  include/linux/kvm_host.h       |  5 ++++-
>  include/linux/kvm_types.h      |  1 +
>  include/uapi/linux/kvm.h       | 13 +++++++++++++
>  tools/include/uapi/linux/kvm.h |  1 +
>  virt/kvm/Kconfig               |  4 ++++
>  virt/kvm/kvm_main.c            | 25 +++++++++++++++++++++---
>  8 files changed, 81 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index eee9f857a986..4568faa33c6d 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6513,6 +6513,26 @@ array field represents return values. The userspace should update the return
>  values of SBI call before resuming the VCPU. For more details on RISC-V SBI
>  spec refer, https://github.com/riscv/riscv-sbi-doc.
>
> +::
> +
> +    /* KVM_EXIT_DIRTY_QUOTA_EXHAUSTED */
> +    struct {
> +        __u64 count;
> +        __u64 quota;
> +    } dirty_quota_exit;
> +
> +If exit reason is KVM_EXIT_DIRTY_QUOTA_EXHAUSTED, it indicates that the VCPU has
> +exhausted its dirty quota. The 'dirty_quota_exit' member of kvm_run structure
> +makes the following information available to the userspace:
> +    count: the current count of pages dirtied by the VCPU, can be
> +    skewed based on the size of the pages accessed by each vCPU.
> +    quota: the observed dirty quota just before the exit to userspace.
> +
> +The userspace can design a strategy to allocate the overall scope of dirtying
> +for the VM among the vcpus. Based on the strategy and the current state of dirty
> +quota throttling, the userspace can make a decision to either update (increase)
> +the quota or to put the VCPU to sleep for some time.
> +
>  ::
>
>      /* KVM_EXIT_NOTIFY */
> @@ -6567,6 +6587,21 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
>
>  ::
>
> +    /*
> +     * Number of pages the vCPU is allowed to have dirtied over its entire
> +     * lifetime. KVM_RUN exits with KVM_EXIT_DIRTY_QUOTA_EXHAUSTED if the quota
> +     * is reached/exceeded.
> +     */
> +    __u64 dirty_quota;
> +
> +Please note that enforcing the quota is best effort, as the guest may dirty
> +multiple pages before KVM can recheck the quota. However, unless KVM is using
> +a hardware-based dirty ring buffer, e.g. Intel's Page Modification Logging,
> +KVM will detect quota exhaustion within a handful of dirtied pages. If a
> +hardware ring buffer is used, the overrun is bounded by the size of the buffer
> +(512 entries for PML).
> +
> +::
>  };
>
>
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 67be7f217e37..bdbd36321d52 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -48,6 +48,7 @@ config KVM
>      select KVM_VFIO
>      select SRCU
>      select INTERVAL_TREE
> +    select HAVE_KVM_DIRTY_QUOTA
>      select HAVE_KVM_PM_NOTIFIER if PM
>      help
>        Support hosting fully virtualized guest machines using hardware
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 18592bdf4c1b..0b9b5c251a04 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -151,11 +151,12 @@ static inline bool is_error_page(struct page *page)
>  #define KVM_REQUEST_NO_ACTION      BIT(10)
>  /*
>   * Architecture-independent vcpu->requests bit members
> - * Bits 3-7 are reserved for more arch-independent bits.
> + * Bits 5-7 are reserved for more arch-independent bits.
>   */
>  #define KVM_REQ_TLB_FLUSH         (0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>  #define KVM_REQ_VM_DEAD           (1 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>  #define KVM_REQ_UNBLOCK           2
> +#define KVM_REQ_DIRTY_QUOTA_EXIT  4

Sorry if I missed anything, but why is it 4 instead of 3?
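
For anyone following the thread, the throttling flow from the commit message (steps i and ii) can be sketched in plain C roughly as below. This is only an illustration and not the code in the patch: struct vcpu_sketch, mark_page_dirty_sketch() and grant_quota_sketch() are hypothetical names, and topping up the quota as "current count plus an allowance" is just one policy userspace might choose, assumed here because the quota is described as a lifetime total.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vCPU fields named in the commit message. */
struct vcpu_sketch {
	uint64_t dirty_count;	/* pages dirtied since creation */
	uint64_t dirty_quota;	/* lifetime total the vCPU may dirty */
};

/*
 * Step i: account for one dirtied page.
 * Step ii: return true when the count equals/exceeds the quota, i.e. when
 * an exit to userspace would be due.
 */
static bool mark_page_dirty_sketch(struct vcpu_sketch *v)
{
	v->dirty_count++;
	return v->dirty_count >= v->dirty_quota;
}

/*
 * Userspace side (assumed policy): allow 'extra' more pages on top of what
 * has already been dirtied.
 */
static void grant_quota_sketch(struct vcpu_sketch *v, uint64_t extra)
{
	v->dirty_quota = v->dirty_count + extra;
}
```

With a quota of 3, the third dirtied page is the one that reports exhaustion; after grant_quota_sketch(&v, 2) the quota becomes 5 and the vCPU can dirty two more pages before the next exit is due. Userspace could equally decide to leave the quota alone and put the vCPU to sleep, as the api.rst text notes.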