On Thu, Jul 14, 2022 at 08:48:04PM +0000, Sean Christopherson wrote:
> On Thu, Jul 14, 2022, Peter Xu wrote:
> > Hi, Shivam,
> >
> > On Tue, Jul 05, 2022 at 12:51:01PM +0530, Shivam Kumar wrote:
> > > Hi, here's a summary of what needs to be changed and what should be
> > > kept as it is (purely my opinion based on the discussions we have had
> > > so far):
> > >
> > > i) Moving the dirty quota check to mark_page_dirty_in_slot. Use kvm
> > > requests in the dirty quota check. I hope that the ceiling-based
> > > approach, with proper documentation and an ioctl exposed for resetting
> > > 'dirty_quota' and 'pages_dirtied', is good enough. Please post your
> > > suggestions if you think otherwise.
> >
> > An ioctl just for this seems like overkill to me.
> >
> > Currently you expose only "quota" in kvm_run, and then on vmexit the
> > exit fields contain both "quota" and "count".  I've always thought
> > that's a bit redundant.
> >
> > What I'm thinking is:
> >
> >   (1) Expose both "quota" and "count" in kvm_run, then:
> >
> >       "quota" should only be written by userspace and read by the kernel.
> >       "count" should only be written by the kernel and read by userspace. [*]
> >
> >       [*] One special case is when userspace finds there's a risk of
> >       quota & count overflow; then userspace should:
> >
> >         - Kick the vcpu out (so the kernel won't write to "count" anymore)
> >         - Update both "quota" and "count" to safe values
> >         - Resume KVM_RUN
> >
> >   (2) When the quota is reached, we don't need to copy quota/count into
> >       the vmexit fields, since userspace can read the real-time values
> >       in kvm_run.
> >
> > Would this work?
>
> Technically, yes; practically speaking, no.  If KVM doesn't provide the
> quota that _KVM_ saw at the time of exit, then there's no sane way to
> audit KVM exits due to KVM_EXIT_DIRTY_QUOTA_EXHAUSTED.  Providing the
> quota ensures userspace sees sane, coherent data if there's a race
> between KVM checking the quota and userspace updating the quota.  If KVM
> doesn't provide the quota, then userspace can see an exit with
> "count < quota".

That is a rare false positive which should be acceptable in this case
(the same as a vmexit with count == quota when we had just planned to
boost the quota).  IMHO it's better than always kicking the vcpu, since
the overhead for such a false positive is only a vmexit and nothing else.

>
> Even if userspace is ok with such races, it will be extremely difficult
> to detect KVM issues if we mess something up, because such behavior
> would have to be allowed by KVM's ABI.

Could you elaborate?  We have quite a few places sharing fields like
these between user and kernel in kvm_run, no?

>
> > > ii) The change in VMX's handle_invalid_guest_state() remains as it is.
> > > iii) For now, we are lazily updating the dirty quota, i.e. we are
> > > updating it only when the vcpu exits to userspace with the exit reason
> > > KVM_EXIT_DIRTY_QUOTA_EXHAUSTED.
> >
> > At least with the above design, IMHO we can update kvm_run.quota in
> > real time without kicking the vcpu threads (the quota only increases,
> > except in the reset phase for overflow).  Or is there any other reason
> > you need to kick the vcpu out when updating the quota?
>
> I'm not convinced overflow is actually possible.

Yeah, on that part I second you.  I don't really think it'll even happen
in practice, but I'd still like to make sure we won't introduce an ioctl
for the overflow case only.

-- 
Peter Xu
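
[A minimal C sketch of the quota/count split and the overflow-reset
sequence Peter describes above.  The field names (dirty_quota,
pages_dirtied) and the kick/resume helpers are hypothetical placeholders
for illustration, not taken from the actual series:]

#include <stdint.h>

/*
 * Hypothetical stand-ins for the two kvm_run fields under discussion:
 * userspace writes the quota, the kernel writes the running count.
 */
struct dirty_quota_shared {
	uint64_t dirty_quota;   /* written by userspace, read by the kernel */
	uint64_t pages_dirtied; /* written by the kernel, read by userspace */
};

/*
 * Placeholder for however the VMM forces the vcpu thread out of KVM_RUN
 * (e.g. a signal to the vcpu thread).
 */
static void kick_vcpu(void)
{
}

/* Placeholder for letting the vcpu thread re-enter KVM_RUN. */
static void resume_vcpu(void)
{
}

/*
 * The special case [*] from the proposal: if userspace sees a risk of
 * the counter overflowing, it kicks the vcpu first (so the kernel stops
 * writing pages_dirtied), rewrites both fields to safe values, and only
 * then resumes KVM_RUN.
 */
static void reset_on_overflow_risk(struct dirty_quota_shared *dq,
				   uint64_t new_quota)
{
	kick_vcpu();
	dq->pages_dirtied = 0;
	dq->dirty_quota = new_quota;
	resume_vcpu();
}

[In the common, non-overflow path userspace only ever raises dirty_quota,
which is why Peter argues the quota can be updated in real time without
kicking the vcpu at all.]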