On Tue, 23 Aug 2022 15:44:42 +0100,
Oliver Upton <oliver.upton@xxxxxxxxx> wrote:
>
> On Mon, Aug 22, 2022 at 10:42:15PM +0100, Marc Zyngier wrote:
> > Hi Gavin,
> >
> > On Mon, 22 Aug 2022 02:58:20 +0100,
> > Gavin Shan <gshan@xxxxxxxxxx> wrote:
> > >
> > > Hi Marc,
> > >
> > > On 8/19/22 6:00 PM, Marc Zyngier wrote:
> > > > On Fri, 19 Aug 2022 01:55:57 +0100,
> > > > Gavin Shan <gshan@xxxxxxxxxx> wrote:
> > > >>
> > > >> The ring-based dirty memory tracking has been available and enabled
> > > >> on x86 for a while. The feature is beneficial when the number of
> > > >> dirty pages is small in a checkpointing system or live migration
> > > >> scenario. More details can be found from fb04a1eddb1a ("KVM: X86:
> > > >> Implement ring-based dirty memory tracking").
> > > >>
> > > >> This enables the ring-based dirty memory tracking on ARM64. It's
> > > >> notable that no extra reserved ring entries are needed on ARM64
> > > >> because the huge pages are always split into base pages when page
> > > >> dirty tracking is enabled.
> > > >
> > > > Can you please elaborate on this? Adding a per-CPU ring of course
> > > > results in extra memory allocation, so there must be a subtle
> > > > x86-specific detail that I'm not aware of...
> > > >
> > >
> > > Sure. I guess it's helpful to explain how it works in next revision.
> > > Something like below:
> > >
> > > This enables the ring-based dirty memory tracking on ARM64. The feature
> > > is enabled by CONFIG_HAVE_KVM_DIRTY_RING, detected and enabled by
> > > CONFIG_HAVE_KVM_DIRTY_RING. A ring buffer is created on every vcpu and
> > > each entry is described by 'struct kvm_dirty_gfn'. The ring buffer is
> > > pushed by host when page becomes dirty and pulled by userspace. A vcpu
> > > exit is forced when the ring buffer becomes full. The ring buffers on
> > > all vcpus can be reset by ioctl command KVM_RESET_DIRTY_RINGS.
> > >
> > > Yes, I think so. Adding a per-CPU ring results in extra memory allocation.
> > > However, it's avoiding synchronization among multiple vcpus when dirty
> > > pages happen on multiple vcpus. More discussion can be found from [1]
> >
> > Oh, I totally buy the relaxation of the synchronisation (though I
> > doubt this will have any visible effect until we have something like
> > Oliver's patches to allow parallel faulting).
> >
>
> Heh, yeah I need to get that out the door. I'll also note that Gavin's
> changes are still relevant without that series, as we do write unprotect
> in parallel at PTE granularity after commit f783ef1c0e82 ("KVM: arm64:
> Add fast path to handle permission relaxation during dirty logging").

Ah, true.

Now if only someone could explain how the whole producer-consumer
thing works without a trace of a barrier, that'd be great...

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
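
To make the barrier question concrete, here is a minimal sketch of a generic single-producer/single-consumer ring in C11, showing where release/acquire ordering would normally sit. This is not KVM's dirty-ring code: `struct gfn_entry`, `struct ring`, `RING_SIZE`, `ring_push()` and `ring_pop()` are all hypothetical names invented for illustration, and the real interface is built around 'struct kvm_dirty_gfn' and differs in detail.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical size; must be a power of two */

/* Hypothetical entry layout, NOT the kernel's struct kvm_dirty_gfn. */
struct gfn_entry {
	uint32_t slot;
	uint64_t offset;
};

struct ring {
	struct gfn_entry entries[RING_SIZE];
	_Atomic uint32_t head; /* next entry to consume, written by consumer */
	_Atomic uint32_t tail; /* next entry to fill, written by producer */
};

/* Producer side: returns false when the ring is full. */
static bool ring_push(struct ring *r, uint32_t slot, uint64_t offset)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire pairs with the consumer's release store to head, so the
	 * slot we are about to overwrite has really been read already. */
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (tail - head == RING_SIZE)
		return false;

	r->entries[tail % RING_SIZE] = (struct gfn_entry){ slot, offset };
	/* Release: the entry contents must be visible before the new tail. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_pop(struct ring *r, struct gfn_entry *out)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Acquire pairs with the producer's release store to tail. */
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head == tail)
		return false;

	*out = r->entries[head % RING_SIZE];
	/* Release: finish reading the entry before handing the slot back. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}
```

Without the release store on the index and the matching acquire load on the other side, a weakly ordered CPU such as arm64 may let the consumer observe the updated index before the entry contents, which is the kind of hazard the question above is pointing at.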