On 07/23/2015 07:13 AM, Andrea Arcangeli wrote:
> On Thu, Jul 23, 2015 at 11:49:38AM +0100, Catalin Marinas wrote:
>> On Thu, Jul 23, 2015 at 12:05:21AM +0100, Dave Hansen wrote:
>>> On 07/22/2015 03:48 PM, Catalin Marinas wrote:
>>>> You are right, on x86 the tlb_single_page_flush_ceiling seems to be
>>>> 33, so for an HPAGE_SIZE range the code does a local_flush_tlb()
>>>> always. I would say a single page TLB flush is more efficient than a
>>>> whole TLB flush but I'm not familiar enough with x86.
>>>
>>> The last time I looked, the instruction to invalidate a single page is
>>> more expensive than the instruction to flush the entire TLB.
>>
>> I was thinking of the overall cost of re-populating the TLB after being
>> nuked rather than the instruction itself.
>
> Unless I'm not aware about timing differences in flushing 2MB TLB
> entries vs flushing 4kb TLB entries with invlpg, the benchmarks that
> have been run to tune the optimal tlb_single_page_flush_ceiling value,
> should already guarantee us that this is a valid optimization (as we
> just got one entry, we're not even close to the 33 ceiling that makes
> it more a grey area).

We had a discussion about this a few weeks ago:

	https://lkml.org/lkml/2015/6/25/666

The argument is that the CPU is so good at refilling the TLB that it
rarely waits on it, so the "cost" can be very very low.

>>> That said, I can't imagine this will hurt anything. We also have TLBs
>>> that can mix 2M and 4k pages and I don't think we did back when we put
>>> that code in originally.
>
> Dave, I'm confused about this. We should still stick to an invariant
> that we can't ever mix 2M and 4k TLB entries if their mappings end up
> overlapping on the same physical memory (if this isn't enforced in
> common code, some x86 implementation errata triggers, and it really
> oopses with machine checks so it's not just theoretical). Perhaps I
> misunderstood what you meant with mix 2M and 4k pages though.

On older CPUs we had dedicated 2M TLB slots. Now, we have an STLB that
can hold 2M and 4k entries at the same time. That will surely change
the performance profile enough that whatever testing we did in the past
is fairly stale now.

I didn't mean mixing 4k and 2M mappings for the same virtual address.
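
For readers following the thread, the ceiling check being discussed can be
modelled roughly as below. This is only a user-space sketch, not the kernel's
code: the MODEL_* names and model_wants_full_flush() are hypothetical, the
value 33 and the 2M/4k sizes are the ones quoted in this thread, and the
strict "greater than" comparison is an assumption about how the ceiling is
applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative constants only; values are the ones quoted in the thread. */
#define MODEL_PAGE_SHIFT    12            /* 4k base pages */
#define MODEL_HPAGE_SIZE    (1UL << 21)   /* one 2M huge page */
#define MODEL_FLUSH_CEILING 33UL          /* tlb_single_page_flush_ceiling */

/*
 * Hypothetical helper: returns true when a range flush over [start, end)
 * would fall back to flushing the whole TLB instead of invalidating
 * one base page at a time.
 */
static bool model_wants_full_flush(unsigned long start, unsigned long end)
{
	unsigned long base_pages;

	assert(end >= start);

	/* Count the range in 4k pages, even if it backs a single 2M mapping. */
	base_pages = (end - start) >> MODEL_PAGE_SHIFT;

	/* Assumption: the ceiling is exclusive, so exactly 33 pages stay per-page. */
	return base_pages > MODEL_FLUSH_CEILING;
}
```

Under this model an HPAGE_SIZE range counts as 512 base pages, well above
the ceiling, so it always takes the full-flush path even though only one
2M entry is involved. That is the case the proposed optimization targets.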