On Sat, Oct 22, 2022, Gavin Shan wrote:
> > > When dirty ring becomes full, the VCPU can't handle any operations, which will
> > > bring more dirty pages.
> >
> > Right, but there's a buffer of 64 entries on top of what the CPU can buffer (VMX's
> > PML can buffer 512 entries). Hence the "soft full". If x86 is already on the
> > edge of exhausting that buffer, i.e. can fill 64 entries while handling requests,
> > then we need to increase the buffer provided by the soft limit because sooner or
> > later KVM will be able to fill 65 entries, at which point errors will occur
> > regardless of when the "soft full" request is processed.
> >
> > In other words, we can take advantage of the fact that the soft-limit buffer needs
> > to be quite conservative.
> >
>
> Right, there are extra 64 entries in the ring between soft full and hard full.
> Another 512 entries are reserved when PML is enabled. However, the other requests,
> which produce dirty pages, are producers to the ring. We can't just have the assumption
> that those producers will need less than 64 entries.

But we're already assuming those producers will need less than 65 entries. My point
is that if one (or even five) extra entries pushes KVM over the limit, then the
buffer provided by the soft limit needs to be jacked up regardless of when the
request is processed.

Hmm, but I suppose it's possible there's a pathological emulator path that can push
double-digit entries, and servicing the request right away ensures that requests
have the full 64-entry buffer to play with.

So yeah, I agree, move it below the DEAD check, but keep it above most everything
else.

> > > > Would it make sense to clear the request in kvm_dirty_ring_reset()? I don't care
> > > > about the overhead of having to re-check the request, the goal would be to help
> > > > document what causes the request to go away.
> > > >
> > > > E.g. modify kvm_dirty_ring_reset() to take @vcpu and then do:
> > > >
> > > > 	if (!kvm_dirty_ring_soft_full(ring))
> > > > 		kvm_clear_request(KVM_REQ_RING_SOFT_FULL, vcpu);
> > > >
> > >
> > > It's reasonable to clear KVM_REQ_DIRTY_RING_SOFT_FULL when the ring is reset.
> > > @vcpu can be achieved by container_of(..., ring).
> >
> > Using container_of() is silly, there's literally one caller that does:
> >
> > 	kvm_for_each_vcpu(i, vcpu, kvm)
> > 		cleared += kvm_dirty_ring_reset(vcpu->kvm, &vcpu->dirty_ring);
> >
>
> May I ask why it's silly by using container_of()?

Because container_of() is inherently dangerous, e.g. if it's used on a pointer that
isn't contained by the expected type, the code will compile cleanly but explode
at runtime. That's unlikely to happen in this case, e.g. doesn't look like we'll
be adding a ring to "struct kvm", but if someone wanted to add a per-VM ring,
taking the vCPU makes it very obvious that pushing to a ring _requires_ a vCPU,
and enforces that requirement at compile time.

In other words, it's preferable to avoid container_of() unless using it solves a
real problem that doesn't have a better alternative.

In these cases, passing in the vCPU is most definitely a better alternative as
each of the functions in question has a sole caller that has easy access to the
container (vCPU), i.e. it's a trivial change.

> In order to avoid using container_of(), kvm_dirty_ring_push() also needs
> @vcpu.

Yep, that one should be changed too.

> So let's change those two functions to something like below. Please
> double-check if they look good to you?
>
>   void kvm_dirty_ring_push(struct kvm_vcpu *vcpu, u32 slot, u64 offset);
>   int kvm_dirty_ring_reset(struct kvm_vcpu *vcpu);

Yep, looks good.
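
For illustration only, here is a minimal userspace sketch of the container_of()
point and of clearing the request on reset. Everything in it is a simplified
stand-in and an assumption on my part, not the real KVM code: the struct layouts,
the 128-entry ring, the bare 64-entry soft-limit headroom (no PML reservation),
and the names ring_push(), ring_reset(), ring_to_vcpu() and REQ_RING_SOFT_FULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins; sizes and names are assumptions, not KVM's. */
#define RING_SIZE          128u
#define SOFT_LIMIT         (RING_SIZE - 64u)	/* 64 entries of headroom */
#define REQ_RING_SOFT_FULL (1u << 0)

struct dirty_entry {
	uint32_t slot;
	uint64_t offset;
};

struct dirty_ring {
	uint32_t dirty_index;	/* next entry to fill */
	uint32_t reset_index;	/* first entry not yet reset */
	struct dirty_entry entries[RING_SIZE];
};

struct vcpu {
	unsigned int requests;
	struct dirty_ring dirty_ring;
};

/* Same idea as the kernel macro: recover the container from a member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Compiles for ANY struct dirty_ring pointer.  If the ring were embedded
 * in some other struct (e.g. a hypothetical per-VM ring), this would still
 * build cleanly and hand back a bogus vCPU pointer.
 */
static struct vcpu *ring_to_vcpu(struct dirty_ring *ring)
{
	return container_of(ring, struct vcpu, dirty_ring);
}

static int ring_soft_full(const struct dirty_ring *ring)
{
	return ring->dirty_index - ring->reset_index >= SOFT_LIMIT;
}

/* Taking the vCPU makes "pushing requires a vCPU" a compile-time rule. */
static void ring_push(struct vcpu *vcpu, uint32_t slot, uint64_t offset)
{
	struct dirty_ring *ring = &vcpu->dirty_ring;
	struct dirty_entry *e;

	/* Hard full: the headroom past the soft limit has been exhausted. */
	assert(ring->dirty_index - ring->reset_index < RING_SIZE);

	e = &ring->entries[ring->dirty_index % RING_SIZE];
	e->slot = slot;
	e->offset = offset;
	ring->dirty_index++;

	if (ring_soft_full(ring))
		vcpu->requests |= REQ_RING_SOFT_FULL;
}

/*
 * Simplification: resets every pushed entry, whereas a real reset would
 * only consume entries that userspace has harvested.  Clearing the request
 * here documents what makes the request go away.
 */
static int ring_reset(struct vcpu *vcpu)
{
	struct dirty_ring *ring = &vcpu->dirty_ring;
	int count = (int)(ring->dirty_index - ring->reset_index);

	ring->reset_index = ring->dirty_index;
	if (!ring_soft_full(ring))
		vcpu->requests &= ~REQ_RING_SOFT_FULL;
	return count;
}
```

With the vCPU passed explicitly, a caller holding only a ring that does not live
inside a vCPU has nothing valid to pass, so the mistake shows up at build time
instead of at runtime via ring_to_vcpu().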