On 05/28/2013 08:36 AM, Marcelo Tosatti wrote:
> On Thu, May 23, 2013 at 03:55:59AM +0800, Xiao Guangrong wrote:
>> kvm_zap_obsolete_pages uses lock-break technique to zap pages,
>> it will flush tlb every time when it does lock-break
>>
>> We can reload mmu on all vcpus after updating the generation
>> number so that the obsolete pages are not used on any vcpus,
>> after that we do not need to flush tlb when obsolete pages
>> are zapped
>
> After that point batching is also not relevant anymore?

No. Without batching, we do not know how much time we will
spend zapping pages. That is bad when zap_all_pages is called
in the vcpu context.

>
> Still concerned about a similar case mentioned earlier:
>
> "
> Note the account for pages freed step after pages are actually
> freed: as discussed with Takuya, having pages freed and freed page
> accounting out of sync across mmu_lock is potentially problematic:
> kvm->arch.n_used_mmu_pages and friends do not reflect reality which can
> cause problems for SLAB freeing and page allocation throttling.
> "
>
> This is a real problem, if you decrease n_used_mmu_pages at
> kvm_mmu_prepare_zap_page, but only actually free pages later
> at kvm_mmu_commit_zap_page, there is the possibility of allowing
> a huge number to be retained. There should be a maximum number of pages
> at invalid_list.
>
> (even higher possibility if you schedule without freeing pages reported
> as released!).
>
>> Note: kvm_mmu_commit_zap_page is still needed before free
>> the pages since other vcpus may be doing locklessly shadow
>> page walking

Ah, yes, I agree with you.

We can introduce a list, say kvm->arch.obsolte_pages, to link all
of the zapped pages; the page shrinker will free the pages on that
list first.

Marcelo, if you have no objection to patches 1 ~ 8 and 11, could you
please let them be merged first, and add some comments and the tlb
optimization later?
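
To make the list idea above concrete, here is a minimal userspace sketch in plain C. It is only a toy model, not kernel code: toy_mmu, toy_add_page, toy_zap_page and toy_shrink are hypothetical names invented for illustration, and the choice to decrement the used-page counter only when memory is actually freed is an assumption of this sketch, not something the patch set does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model. All names here are hypothetical, not kernel APIs. */
struct toy_page {
	struct toy_page *next;
};

struct toy_mmu {
	struct toy_page *active;   /* pages still in use */
	struct toy_page *obsolete; /* zapped, but memory not yet freed */
	size_t n_used;             /* pages whose memory is still held */
	size_t n_obsolete;         /* length of the obsolete list */
};

/* Allocate one page and account for it. Returns 0 on success, -1 on OOM. */
static int toy_add_page(struct toy_mmu *mmu)
{
	struct toy_page *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->next = mmu->active;
	mmu->active = p;
	mmu->n_used++;
	return 0;
}

/*
 * "Zap" one page: unlink it from the active list and park it on the
 * obsolete list. n_used is deliberately NOT decremented here, because
 * the memory has not been returned yet (assumption of this sketch).
 * Returns 1 if a page was moved, 0 if there was nothing to zap.
 */
static int toy_zap_page(struct toy_mmu *mmu)
{
	struct toy_page *p = mmu->active;

	if (!p)
		return 0;
	mmu->active = p->next;
	p->next = mmu->obsolete;
	mmu->obsolete = p;
	mmu->n_obsolete++;
	return 1;
}

/*
 * Free up to nr pages, draining the obsolete list before touching
 * active pages. Returns the number of pages actually freed.
 */
static size_t toy_shrink(struct toy_mmu *mmu, size_t nr)
{
	size_t freed = 0;

	while (freed < nr) {
		struct toy_page **list =
			mmu->obsolete ? &mmu->obsolete : &mmu->active;
		struct toy_page *p = *list;

		if (!p)
			break;
		if (list == &mmu->obsolete)
			mmu->n_obsolete--;
		*list = p->next;
		free(p);
		mmu->n_used--; /* accounting changes only when memory is freed */
		freed++;
	}
	/* Obsolete pages are a subset of the pages still accounted as used. */
	assert(mmu->n_obsolete <= mmu->n_used);
	return freed;
}
```

In this model the counter and the memory actually held never diverge, and a shrink request reclaims already-zapped pages before it evicts live ones. The real code would additionally need mmu_lock and the kvm_mmu_commit_zap_page step because of lockless shadow page walkers, which this sketch ignores.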