On Thu, Jul 23, 2015 at 12:05:21AM +0100, Dave Hansen wrote:
> On 07/22/2015 03:48 PM, Catalin Marinas wrote:
> > You are right, on x86 the tlb_single_page_flush_ceiling seems to be
> > 33, so for an HPAGE_SIZE range the code does a local_flush_tlb()
> > always. I would say a single page TLB flush is more efficient than a
> > whole TLB flush but I'm not familiar enough with x86.
>
> The last time I looked, the instruction to invalidate a single page is
> more expensive than the instruction to flush the entire TLB.

I was thinking of the overall cost of re-populating the TLB after it has
been nuked, rather than the cost of the instruction itself.

> We also don't bother doing ranged flushes _ever_ for hugetlbfs TLB
> invalidations, but that was just because the work done around commit
> e7b52ffd4 didn't see any benefit.

For huge pages, there are indeed fewer page table levels to fetch, so I
guess the impact is not significant. With virtualisation/nested pages,
at least on ARM, refilling the TLB for a guest would take longer (though
it is highly dependent on the microarchitecture implementation, e.g.
whether it caches the guest PA to host PA translation separately).

> That said, I can't imagine this will hurt anything. We also have TLBs
> that can mix 2M and 4k pages and I don't think we did back when we put
> that code in originally.

Another question is whether flushing a single address is enough for a
huge page. I assumed it is, since tlb_remove_pmd_tlb_entry() only
adjusts the mmu_gather range by PAGE_SIZE (rather than HPAGE_SIZE) and
no one has complained so far. AFAICT, there are only 3 architectures
that don't use asm-generic/tlb.h, but they all seem to handle this case:

arch/arm: implements tlb_remove_pmd_tlb_entry() in a similar way to
  the generic one
arch/s390: tlb_remove_pmd_tlb_entry() is a no-op
arch/ia64: does not support THP

--
Catalin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body
to majordomo@xxxxxxxxx. For more info on Linux MM, see:
http://www.linux-mm.org/ .