On Wed, Feb 05, 2020 at 10:58:14AM +0100, David Hildenbrand wrote: > On 05.02.20 10:49, Wang, Wei W wrote: > > On Wednesday, February 5, 2020 5:37 PM, David Hildenbrand wrote: > >>> > >>> Not sure how TCG tracks the dirty bits. But In whatever > >>> implementation, the hypervisor should have > >> > >> There is only a single bitmap for that purpose. (well, the one where KVM > >> syncs to) > >> > >>> already dealt with the race between he current round and the previous > >> round dirty recording. > >>> (the race isn't brought by this feature essentially) > >> > >> It is guaranteed to work reliably without this feature as you only clear what > >> *has been migrated*, > > > > Not "clear what has been migrated" (that skips nothing..) > > Anyway, it's a hint used for optimization. > > Yes, an optimization that might easily lead to data corruption when the > two bitmaps are either not in place or don't play along in that specific > way (and I suspect this is the case under TCG). So I checked and TCG has two copies too. Each block has bmap used for migration and also dirty_memory where pages are marked dirty. See cpu_physical_memory_sync_dirty_bitmap. So from QEMU POV, there is a callback that tells balloon when it's safe to request hints. As that affects the bitmap, that must not happen in parallel with dirty bitmap handling. Sounds like a reasonable limitation. The hint can be useful outside migration, but in its current form needs to then be non-destructive. E.g. I can imaging userspace calling MADV_SOFT_OFFLINE on the hinted memory. Again a flag that tells guest it should wait until used could be a reasonable expension. If we stick to the shrinker it's actually implementable easily. With an OOM notifier - I'm not so sure ... And a big part of the problem is that after all this time the page hinting interfaces are still undocumented. Quite sad really :( > -- > Thanks, > > David / dhildenb