This patchset is the KVM-side implementation of a new dirty-quota-based
throttling algorithm. The algorithm selectively throttles vCPUs based on
their individual contribution to overall memory dirtying, and it
dynamically adapts the throttle to the available network bandwidth.

Overview
----------

To throttle memory dirtying, we propose to limit the number of pages a
vCPU can dirty in each of a series of fixed-size, very short time
intervals. This limit depends on the network throughput calculated over
the last few intervals, so that the vCPUs are throttled according to the
available network bandwidth. We refer to this limit as the "dirty quota"
of a vCPU and to the fixed-size intervals as the "dirty quota intervals".

One way to distribute the overall dirtying budget for a dirty quota
interval is to split it equally among all the vCPUs. An equal split works
poorly when the workload is skewed across vCPUs. To handle such skewed
cases, we propose that if a vCPU doesn't need its quota for a given dirty
quota interval, that quota is added to a common pool. This common pool
(or "common quota") can be consumed on a first-come, first-served basis
by all vCPUs in the upcoming dirty quota intervals.

Design
----------

Initialization

vCPUDirtyQuotaContext keeps the dirty quota context for each vCPU. It
holds the number of pages the vCPU has dirtied in the ongoing dirty quota
interval (dirty_counter) and the maximum number of pages the vCPU is
allowed to dirty in the ongoing dirty quota interval (dirty_quota).

    struct vCPUDirtyQuotaContext {
        u64 dirty_counter;
        u64 dirty_quota;
    };

The flag dirty_quota_migration_enabled determines whether dirty
quota-based throttling is enabled for an ongoing migration.

Handling page dirtying

When the guest tries to dirty a page, it triggers a vmexit because each
page is write-protected. In the vmexit path, we increment the
dirty_counter for the corresponding vCPU.
Then, we check whether the vCPU has exceeded its quota. If it has, we
exit to userspace with a new exit reason, KVM_EXIT_DIRTY_QUOTA_FULL.
This "quota full" event is then handled on the userspace side.

Please find the KVM Forum presentation on dirty quota-based throttling
here: https://www.youtube.com/watch?v=ZBkkJf78zFA

Shivam Kumar (6):
  Define data structures for dirty quota migration.
  Init dirty quota flag and allocate memory for vCPUdqctx.
  Add KVM_CAP_DIRTY_QUOTA_MIGRATION and handle vCPU page faults.
  Increment dirty counter for vmexit due to page write fault.
  Exit to userspace when dirty quota is full.
  Free vCPUdqctx memory on vCPU destroy.

 Documentation/virt/kvm/api.rst        | 39 +++++++++++++++++++
 arch/x86/include/uapi/asm/kvm.h       |  1 +
 arch/x86/kvm/Makefile                 |  3 +-
 arch/x86/kvm/x86.c                    |  9 +++++
 include/linux/dirty_quota_migration.h | 52 +++++++++++++++++++++++++
 include/linux/kvm_host.h              |  3 ++
 include/uapi/linux/kvm.h              | 11 ++++++
 virt/kvm/dirty_quota_migration.c      | 31 +++++++++++++++
 virt/kvm/kvm_main.c                   | 56 ++++++++++++++++++++++++++-
 9 files changed, 203 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/dirty_quota_migration.h
 create mode 100644 virt/kvm/dirty_quota_migration.c

-- 
2.22.3
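
To make the accounting described under "Handling page dirtying" and the
common quota from the Overview concrete, here is a minimal userspace
sketch in C. It is an illustration under stated assumptions, not the code
in this series: the helper names (dirty_quota_record_write,
dirty_quota_end_interval, dirty_quota_claim_common) are hypothetical, u64
is spelled uint64_t, "quota full" is assumed to mean that the counter has
reached the quota, and the rule for refilling the common pool at the end
of an interval is one plausible reading of the Overview.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-vCPU context, mirroring the struct from the Design section. */
struct vCPUDirtyQuotaContext {
	uint64_t dirty_counter; /* pages dirtied in the ongoing interval */
	uint64_t dirty_quota;   /* pages allowed in the ongoing interval */
};

/*
 * Hypothetical helper: account for one write fault and report whether
 * the vCPU should exit to userspace (KVM_EXIT_DIRTY_QUOTA_FULL in the
 * series). Assumption: "full" means the counter has reached the quota.
 */
static bool dirty_quota_record_write(struct vCPUDirtyQuotaContext *ctx)
{
	assert(ctx != NULL);
	ctx->dirty_counter++;
	return ctx->dirty_counter >= ctx->dirty_quota;
}

/*
 * Hypothetical userspace-side helper: at the end of an interval, move
 * the unused part of the vCPU's quota into the common pool, then start
 * the next interval with a fresh quota.
 */
static void dirty_quota_end_interval(struct vCPUDirtyQuotaContext *ctx,
				     uint64_t *common_quota,
				     uint64_t next_quota)
{
	assert(ctx != NULL && common_quota != NULL);
	if (ctx->dirty_counter < ctx->dirty_quota)
		*common_quota += ctx->dirty_quota - ctx->dirty_counter;
	ctx->dirty_counter = 0;
	ctx->dirty_quota = next_quota;
}

/*
 * Hypothetical userspace-side helper: on a "quota full" exit, claim up
 * to `want` pages from the common pool, first come, first served.
 * Returns the number of pages actually granted.
 */
static uint64_t dirty_quota_claim_common(struct vCPUDirtyQuotaContext *ctx,
					 uint64_t *common_quota,
					 uint64_t want)
{
	uint64_t granted;

	assert(ctx != NULL && common_quota != NULL);
	granted = want < *common_quota ? want : *common_quota;
	*common_quota -= granted;
	ctx->dirty_quota += granted;
	return granted;
}
```

In this sketch, a vCPU that hits its quota would first try
dirty_quota_claim_common() and would be throttled only if nothing is
granted; an idle vCPU feeds the pool through dirty_quota_end_interval().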