On Thu, Jan 19, 2023 at 11:42 AM James Houghton <jthoughton@xxxxxxxxxx> wrote:
>
> On Thu, Jan 19, 2023 at 9:32 AM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
> >
> > On 01/19/23 08:57, James Houghton wrote:
> > > FWIW, what makes the most sense to me right now is to implement the
> > > THP-like scheme and mark HGM as mutually exclusive with the vmemmap
> > > optimization. We can later come up with a scheme that lets us retain
> > > compatibility. (Is that what you mean by "this can be done somewhat
> > > independently", Mike?)
> >
> > Sort of, I was only saying that getting the ref/map counting right seems
> > like a task that can be independently worked. Using the THP-like scheme
> > is good.
>
> Ok! So if you're ok with the intermediate mapping sizes, it sounds
> like I should go ahead and implement the THP-like scheme.

It turns out that the THP-like scheme significantly slows down
MADV_COLLAPSE: decrementing the mapcounts for the 4K subpages becomes
the vast majority of the time spent in MADV_COLLAPSE when collapsing
1G mappings. It is doing 262,144 atomic decrements (one per 4K
subpage of the 1G page), so this makes sense.

This is only really a problem because the decrements happen between
mmu_notifier_invalidate_range_start() and
mmu_notifier_invalidate_range_end(), so KVM won't allow vCPUs to
access any part of the 1G page while they are in progress (and it can
take roughly one second per 1G region, at least on the x86 server I
was testing on).

- James
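---
A rough, purely illustrative sketch of the loop in question (the
function and macro names below are hypothetical, not from the actual
patch, and the exact mmu_notifier_range_init() arguments vary across
kernel versions):

	#include <linux/mm.h>
	#include <linux/mmu_notifier.h>

	/* 1G / 4K = 262,144 subpages on x86. */
	#define SUBPAGES_PER_PUD (PUD_SIZE / PAGE_SIZE)

	/*
	 * Hypothetical collapse path for one 1G region under the
	 * THP-like scheme: one atomic mapcount decrement per 4K
	 * subpage, all inside the mmu_notifier invalidation window.
	 */
	static void hgm_collapse_pud(struct mm_struct *mm,
				     unsigned long addr,
				     struct page *head)
	{
		struct mmu_notifier_range range;
		unsigned long i;

		mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm,
					addr, addr + PUD_SIZE);
		mmu_notifier_invalidate_range_start(&range);

		/*
		 * The expensive part: ~262k atomic decrements. KVM
		 * blocks vCPU access to the whole 1G range until
		 * mmu_notifier_invalidate_range_end() below.
		 */
		for (i = 0; i < SUBPAGES_PER_PUD; i++)
			atomic_dec(&head[i]._mapcount);

		/* ... install the single 1G mapping here ... */

		mmu_notifier_invalidate_range_end(&range);
	}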