On Sat, Dec 05, 2015 at 01:54:51PM +0530, Aneesh Kumar K.V wrote:
> If we can update mmu_gather to track the page size of the pages, that
> will also help some archs to better implement tlb_flush(struct
> mmu_gather *). Right now arch/powerpc/mm/tlb_nohash.c does flush the tlb
> mapping for the entire mm_struct.
>
> we can also make sure that we do a force flush when we are trying to
> gather pages of different size. So one instance of mmu_gather will end
> up gathering pages of specific size only ?

Tracking the TLB flush of multiple page sizes won't bring down the
complexity of the fix though. In fact the multiple page sizes are
arch knowledge, so such an improvement would need to break the arch API
of the MMU gather.

THP is a common code abstraction, so the fix is self-contained in the
common code, and it can't take more than one bit to encode the flush
size because THP supports only one page size.

To achieve the multiple TLB flush sizes we could use an array of
unsigned long long physaddr where the bits below PAGE_SHIFT are the
page order. That would however require a pfn_to_page to then free the
page, so it's probably better to have the page struct and an order in
two different fields and double up the array size of the MMU gather.
Then we could as well look at whether we can go cross-mm so that it's
usable for the rmap-walk too, which is what I was looking into when I
found the theoretical THP SMP TLB flushing race.

In my view this is even more complicated from an implementation
standpoint because it isn't self-contained in the common code. So I
doubt it's worth mixing the optimization in arch code for hugetlbfs
with the THP race fix, which is all common code knowledge and is
actually a fix (albeit purely theoretical) and not an optimization.

Thanks,
Andrea
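
For illustration only, the two layouts discussed above could look roughly like the userspace sketch below. It is not kernel code: every ex_* name is hypothetical, EX_PAGE_SHIFT assumes 4 KiB base pages, and struct ex_page is just an opaque stand-in for struct page.

```c
#include <assert.h>

/* Assumption: 4 KiB base pages; the real PAGE_SHIFT is per-arch. */
#define EX_PAGE_SHIFT 12
#define EX_ORDER_MASK ((1ULL << EX_PAGE_SHIFT) - 1)

/*
 * Layout 1: one unsigned long long per entry. The physical address is
 * page aligned, so the bits below EX_PAGE_SHIFT are free to hold the
 * page order.
 */
static inline unsigned long long ex_encode(unsigned long long physaddr,
                                           unsigned int order)
{
    assert((physaddr & EX_ORDER_MASK) == 0); /* must be page aligned */
    assert(order <= EX_ORDER_MASK);          /* must fit in the low bits */
    return physaddr | order;
}

static inline unsigned long long ex_physaddr(unsigned long long entry)
{
    return entry & ~EX_ORDER_MASK;
}

static inline unsigned int ex_order(unsigned long long entry)
{
    return (unsigned int)(entry & EX_ORDER_MASK);
}

/*
 * Freeing the page from layout 1 needs the pfn first, and then a
 * pfn-to-page lookup (not modelled here).
 */
static inline unsigned long long ex_pfn(unsigned long long entry)
{
    return ex_physaddr(entry) >> EX_PAGE_SHIFT;
}

/*
 * Layout 2: page pointer and order in two separate fields. No lookup
 * is needed to free the page, but on a typical 64-bit ABI each entry
 * takes about twice the space of layout 1.
 */
struct ex_page; /* opaque stand-in, hypothetical */

struct ex_gather_entry {
    struct ex_page *page;
    unsigned int order;
};
```

With layout 1, a 2 MiB page at physical address 0x200000 would be stored as ex_encode(0x200000ULL, 9), and ex_order() and ex_pfn() recover the order and the pfn from that single word.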