On Thu, Jan 09, 2020 at 05:28:36PM -0500, Michael S. Tsirkin wrote:
> On Thu, Jan 09, 2020 at 02:39:49PM -0500, Peter Xu wrote:
> > On Thu, Jan 09, 2020 at 02:08:52PM -0500, Michael S. Tsirkin wrote:
> > > On Thu, Jan 09, 2020 at 12:08:49PM -0500, Peter Xu wrote:
> > > > On Thu, Jan 09, 2020 at 11:40:23AM -0500, Michael S. Tsirkin wrote:
> > > >
> > > > [...]
> > > >
> > > > > > > I know it's mostly relevant for huge VMs, but OTOH these
> > > > > > > probably use huge pages.
> > > > > >
> > > > > > Yes huge VMs could benefit more, especially if the dirty rate is not
> > > > > > that high, I believe.  Though, could you elaborate on why huge pages
> > > > > > are special here?
> > > > > >
> > > > > > Thanks,
> > > > >
> > > > > With hugetlbfs there are less bits to test: e.g. with 2M pages a single
> > > > > bit set marks 512 pages as dirty.  We do not take advantage of this
> > > > > but it looks like a rather obvious optimization.
> > > >
> > > > Right, but isn't that the trade-off between granularity of dirty
> > > > tracking and how easy it is to collect the dirty bits?  Say, it'll be
> > > > merely impossible to migrate 1G-huge-page-backed guests if we track
> > > > dirty bits using huge page granularity, since each touch of guest
> > > > memory will cause another 1G memory to be transferred even if most of
> > > > the content is the same.  2M can be somewhere in the middle, but still
> > > > the same write amplify issue exists.
> > > >
> > >
> > > OK I see I'm unclear.
> > >
> > > IIUC at the moment KVM never uses huge pages if any part of the huge page is
> > > tracked.
> >
> > To be more precise - I think it's per-memslot.  Say, if the memslot is
> > dirty tracked, then no huge page on the host on that memslot (even if
> > guest used huge page over that).
>
> Yea ... so does it make sense to make this implementation detail
> leak through UAPI?

I don't think that leaks an internal implementation detail.  We simply
define the interface so that the address in each kvm_dirty_gfn is always
host-page aligned (by default that means no huge pages) and points to a
single host page, that's all.  The host page size is always available to
userspace anyway, so IMHO it's fine.

Thanks,

--
Peter Xu
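
The granularity trade-off discussed in this thread can be sketched in a few
lines of C.  This is only an illustration under stated assumptions: the helper
names pages_per_dirty_bit() and is_page_aligned() are hypothetical and are not
part of the KVM UAPI, and the 4K base page size is an assumed x86 default
rather than something the thread specifies.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a KVM API): how many base pages a single dirty
 * bit covers when dirty state is tracked at `track_size` granularity.
 * This is also the write amplification factor for one small guest write.
 */
static uint64_t pages_per_dirty_bit(uint64_t track_size, uint64_t base_page_size)
{
    assert(base_page_size != 0);
    return track_size / base_page_size;
}

/*
 * Hypothetical helper (not a KVM API): check that an address is aligned to
 * `page_size`, which must be a power of two.  This mirrors the property the
 * reply describes for each reported dirty address.
 */
static int is_page_aligned(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (addr & (page_size - 1)) == 0;
}
```

With an assumed 4K base page, pages_per_dirty_bit(2ULL << 20, 4096) gives 512,
matching the 2M figure quoted above, and pages_per_dirty_bit(1ULL << 30, 4096)
gives 262144, which is why tracking at 1G granularity makes migration so
costly.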