In v15 of my patch set, which can be found here (https://lore.kernel.org/lkml/20191205161928.19548.41654.stgit@localhost.localdomain/), I had introduced an RFC patch that used MADV_FREE in QEMU instead of MADV_DONTNEED. When testing that, I was running a next-20191120 kernel on the host.

While preparing the numbers for my latest version I updated the host to next-20191219, and that is where I encountered an issue: MADV_FREE is now significantly slower than MADV_DONTNEED when used to report pages from QEMU to the kernel, with those pages eventually being faulted back into the guest. No regression was seen with MADV_DONTNEED.

I just wanted to put it out there that something appears to have added spinlock overhead as high as 60% on 16 cores when MADV_FREE is used to notify the system that a given transparent huge page isn't needed and the memory is then faulted back in. I'll try to bisect this as time permits, but I thought I would mention it in case somebody has already seen something similar and found the root cause.

Thanks.

- Alex
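
For anyone who wants to poke at this without pulling in QEMU, the pattern boils down to something like the sketch below. This is a minimal standalone illustration, not the actual QEMU code; the 2 MiB THP size, the MADV_HUGEPAGE hint, and the single-region setup are assumptions for the example. To actually see the contention you would run this sequence in a loop across many regions and threads.

#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define THP_SIZE (2UL << 20)	/* assuming 2 MiB transparent huge pages */

int main(void)
{
	/* Over-allocate so we can pick a 2 MiB-aligned address, which
	 * THP needs in order to back the range with a huge page. */
	char *raw = mmap(NULL, 2 * THP_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	char *buf = (char *)(((uintptr_t)raw + THP_SIZE - 1) &
			     ~(THP_SIZE - 1));

	madvise(buf, THP_SIZE, MADV_HUGEPAGE);	/* hint for THP backing */
	memset(buf, 1, THP_SIZE);		/* fault the huge page in */

	/* Report the page as unneeded; swap in MADV_DONTNEED here to
	 * compare the two paths. */
	if (madvise(buf, THP_SIZE, MADV_FREE)) {
		perror("madvise");
		return 1;
	}

	memset(buf, 2, THP_SIZE);		/* fault the memory back in */

	munmap(raw, 2 * THP_SIZE);
	return 0;
}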