On 06/08/2018 09:51, Xiao Guangrong wrote:
>
>
> On 07/27/2018 11:46 PM, Paolo Bonzini wrote:
>> We are currently cutting hva_to_pfn_fast short if we do not want an
>> immediate exit, which is represented by !async && !atomic.  However,
>> this is unnecessary, and __get_user_pages_fast is *much* faster
>> because the regular get_user_pages takes pmd_lock/pte_lock.
>> In fact, when many CPUs take a nested vmexit at the same time
>> the contention on those locks is visible, and this patch removes
>> about 25% (compared to 4.18) from vmexit.flat on a 16 vCPU
>> nested guest.
>>
>
> Nice improvement.
>
> Then after that, we will unconditionally try hva_to_pfn_fast(), does
> it hurt the case that the mappings in the host's page tables have not
> been present yet?

I don't think so, because that's quite slow anyway.

> Can we apply this tech to other places using gup or even squash it
> into get_user_pages()?

That may make sense.  Andrea, do you have an idea?

Paolo