Re: [PATCH v4 1/4] KVM: Implement dirty quota-based throttling of vcpus

On Fri, Jul 15, 2022 at 04:23:54PM +0000, Sean Christopherson wrote:
> On Thu, Jul 14, 2022, Peter Xu wrote:
> > On Thu, Jul 14, 2022 at 08:48:04PM +0000, Sean Christopherson wrote:
> > > On Thu, Jul 14, 2022, Peter Xu wrote:
> > > > Hi, Shivam,
> > > > 
> > > > On Tue, Jul 05, 2022 at 12:51:01PM +0530, Shivam Kumar wrote:
> > > > > Hi, here's a summary of what needs to be changed and what should be kept as
> > > > > it is (purely my opinion based on the discussions we have had so far):
> > > > > 
> > > > > i) Moving the dirty quota check to mark_page_dirty_in_slot. Use kvm requests
> > > > > in the dirty quota check. I hope that the ceiling-based approach, with proper
> > > > > documentation and an ioctl exposed for resetting 'dirty_quota' and
> > > > > 'pages_dirtied', is good enough. Please post your suggestions if you think
> > > > > otherwise.
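
For concreteness, here's a rough sketch of what such a check in
mark_page_dirty_in_slot() could look like with kvm requests.  The request
name KVM_REQ_DIRTY_QUOTA_EXIT and the exact placement of dirty_quota /
pages_dirtied are illustrative, not necessarily what the patch does:

  /* virt/kvm/kvm_main.c (sketch only) */
  #include <linux/kvm_host.h>

  void mark_page_dirty_in_slot(struct kvm *kvm,
                               const struct kvm_memory_slot *memslot,
                               gfn_t gfn)
  {
          struct kvm_vcpu *vcpu = kvm_get_running_vcpu();

          /* ... existing dirty bitmap / dirty ring handling ... */

          /* A quota of 0 would mean throttling is disabled. */
          if (vcpu && vcpu->run->dirty_quota) {
                  vcpu->run->pages_dirtied++;
                  /* Ceiling check: request an exit once the quota is hit;
                   * the vcpu exits at the next request processing point. */
                  if (vcpu->run->pages_dirtied >= vcpu->run->dirty_quota)
                          kvm_make_request(KVM_REQ_DIRTY_QUOTA_EXIT, vcpu);
          }
  }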
> > > > 
> > > > An ioctl just for this seems like overkill to me.
> > > >
> > > > Currently you expose only "quota" in kvm_run, and then on vmexit the
> > > > exit fields contain both "quota" and "count".  I've always thought
> > > > that's a bit redundant.
> > > > 
> > > > What I'm thinking is:
> > > > 
> > > >   (1) Expose both "quota" and "count" in kvm_run, then:
> > > > 
> > > >       "quota" should only be written by userspace and read by kernel.
> > > >       "count" should only be written by kernel and read by the userspace. [*]
> > > > 
> > > >       [*] One special case is when the userspace finds that there's a risk
> > > >       of quota & count overflow, in which case the userspace should:
> > > > 
> > > >         - Kick the vcpu out (so the kernel won't write to "count" anymore)
> > > >         - Update both "quota" and "count" to safe values
> > > >         - Resume the KVM_RUN
> > > > 
> > > >   (2) When quota reached, we don't need to copy quota/count in vmexit
> > > >       fields, since the userspace can read the realtime values in kvm_run.
> > > > 
> > > > Would this work?
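
To make (1) concrete, the userspace side could look roughly like this.  All
names are illustrative (OVERFLOW_WATERMARK is a made-up safety margin, and
dirty_quota / dirty_quota_count are the hypothetical kvm_run fields):

  #include <stdint.h>
  #include <linux/kvm.h>

  /* Leave plenty of headroom before the u64 count actually wraps. */
  #define OVERFLOW_WATERMARK (UINT64_MAX - (1ULL << 32))

  static void refill_dirty_quota(struct kvm_run *run, uint64_t budget)
  {
          if (run->dirty_quota_count < OVERFLOW_WATERMARK) {
                  /* Common case: raise the ceiling.  A racy read of the
                   * count only means one extra vmexit, nothing worse. */
                  run->dirty_quota = run->dirty_quota_count + budget;
                  return;
          }

          /* Rare case: close to overflow.  The caller must have kicked
           * the vcpu out of KVM_RUN first (e.g. via a signal), so the
           * kernel is no longer writing the count and both fields can
           * be reset to safe values before re-entering KVM_RUN. */
          run->dirty_quota_count = 0;
          run->dirty_quota = budget;
  }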
> > > 
> > > Technically, yes, practically speaking, no.  If KVM doesn't provide the quota
> > > that _KVM_ saw at the time of exit, then there's no sane way to audit KVM exits
> > > due to KVM_EXIT_DIRTY_QUOTA_EXHAUSTED.  Providing the quota ensures userspace sees
> > > sane, coherent data if there's a race between KVM checking the quota and userspace
> > > updating the quota.  If KVM doesn't provide the quota, then userspace can see an
> > > exit with "count < quota".
> > 
> > This is a rare false positive which should be acceptable in this case (the
> > same as a vmexit with count==quota right when we had planned to boost the
> > quota).  IMHO it's better than always kicking the vcpu, since the overhead
> > for such a false positive is only a vmexit and nothing else.
> 
> Oh, we're in complete agreement on that front.  I'm only objecting to forcing
> userspace to read the realtime quota+count.  I want KVM to provide a snapshot
> of the quota+count so that if there's a KVM bug, e.g. KVM spuriously exits,
> then there is zero ambiguity, as the quota+count in the kvm_run exit fields
> will be seen to hold invalid/garbage data (e.g. count < quota).

That would need to be a double accident: one accident causing the spurious
exit, and another setting the vmexit reason to exactly QUOTA_FULL. :) But I
see what you mean.
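
(Presumably such a snapshot would just be a small struct in the kvm_run exit
union, something along these lines; the layout is illustrative:

  /* inside the union in struct kvm_run (sketch only): */
  struct {
          __u64 count;  /* pages_dirtied at the time of the exit */
          __u64 quota;  /* dirty_quota at the time of the exit */
  } dirty_quota_exit;

so whoever handles KVM_EXIT_DIRTY_QUOTA_EXHAUSTED can audit count >= quota
against the snapshot rather than against the live kvm_run values.)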

> Without a snapshot, if there were a bug where KVM spuriously
> exited, root causing or even detecting the bug would be difficult if userspace is
> dynamically updating the quota, as changing the quota would have destroyed the
> evidence of KVM's bug.
> 
> It's unlikely we'll ever have such a bug, but filling the exit fields is cheap, and
> because it's a union, the "redundant" fields don't consume extra space in kvm_run.

Yes, no objection if you think that's better; the overhead is indeed pretty
trivial.

> 
> And the reasoning behind not having kvm_run.dirty_count is that it's fully
> redundant if KVM provides a stat, and IMO such a stat will be quite helpful for
> things beyond dirty quotas, e.g. being able to see which vCPUs are dirtying memory
> from the command line for debug purposes.

Not if we keep overflow in mind?  Note that I totally agree the overflow may
never even happen, but I think it makes sense to consider it for a complete
design of the ceiling-based approach.  Think of the Millennium bug: we never
know what will happen in the future...

So no objection either to having stats for dirty pages; it's just that if we
still want to cover the overflow issue we'd have to make dirty_count writable,
and then it would still be better to keep it in kvm_run, imho.
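
(For reference, the stat you mentioned would presumably be just one more
per-vcpu counter in the generic stats; a sketch, with the exact placement
illustrative:

  /* include/linux/kvm_types.h */
  struct kvm_vcpu_stat_generic {
          ...
          u64 pages_dirtied;
  };

  /* plus a descriptor so it shows up via the binary stats fd and
   * debugfs, e.g. added to KVM_GENERIC_VCPU_STATS(): */
  STATS_DESC_COUNTER(VCPU_GENERIC, pages_dirtied),

That would be enough for the command-line debugging use case you mentioned.)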

-- 
Peter Xu



