On Tue, May 15, 2012 at 8:58 PM, Luming Yu <luming.yu@xxxxxxxxx> wrote:
> On Tue, May 15, 2012 at 5:17 PM, Nick Piggin <npiggin@xxxxxxxxx> wrote:
>> On 15 May 2012 19:15, Nick Piggin <npiggin@xxxxxxxxx> wrote:
>>> So this should go to linux-arch...
>>>
>>> On 15 May 2012 18:55, Alex Shi <alex.shi@xxxxxxxxx> wrote:
>>>> Not every flush_tlb_mm execution moment is really need to evacuate all
>>>> TLB entries, like in munmap, just few 'invlpg' is better for whole
>>>> process performance, since it leaves most of TLB entries for later
>>>> accessing.
>>
>> Did you have microbenchmarks for this like your mprotect numbers,
>> by the way? Test munmap numbers and see how that looks. Also,
>
> Might be off topic, but I just spent few minutes to test out the difference
> between write CR3 vs. invlpg on a pretty old but still reliable P4 desktop
> with my simple hardware latency and bandwidth test tool I posted for
> RFC several weeks ago on LKML.
>
> Both __native_flush_tlb() and __native_flush_tlb_single(...)
> introduced roughly 1 ns latency to tsc sampling executed in

Sorry, typo: 1 us. But I should capture nanosecond data. :-(

> stop_machine_context in two logical CPUs
>
> Just to fuel the discussion. :-)
>
> Cheers,
> /l
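
For the munmap numbers asked about above, a userspace microbenchmark along the following lines might be a starting point. It is only a sketch and is not the test tool mentioned in the thread: the function names (now_ns, touch_pages, retouch_after_munmap_ns) and the BENCH_STANDALONE switch are made up here for illustration. It cannot tell a CR3 write from invlpg directly. It only times how long it takes to re-touch the pages that stay mapped after a partial munmap, which should include TLB refill cost if the kernel flushed more than the unmapped range.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS on older toolchains */
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Write one byte in each page so every page is accessed once. */
static void touch_pages(volatile unsigned char *base, size_t npages,
                        size_t page)
{
    for (size_t i = 0; i < npages; i++)
        base[i * page]++;
}

/*
 * Hypothetical helper: map npages, touch them all, unmap the last
 * unmap_pages, then time a second pass over the pages that remain.
 * Returns the elapsed time in ns, or -1 on bad arguments or failure.
 */
static int64_t retouch_after_munmap_ns(size_t npages, size_t unmap_pages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t keep;
    unsigned char *buf;
    uint64_t t0, t1;

    if (npages == 0 || unmap_pages >= npages)
        return -1;
    keep = npages - unmap_pages;

    buf = mmap(NULL, npages * page, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    touch_pages(buf, npages, page);     /* fault in and warm up */

    if (unmap_pages > 0 &&
        munmap(buf + keep * page, unmap_pages * page) != 0) {
        munmap(buf, npages * page);
        return -1;
    }

    t0 = now_ns();
    touch_pages(buf, keep, page);       /* timed pass over what is left */
    t1 = now_ns();

    munmap(buf, keep * page);
    return (int64_t)(t1 - t0);
}

#ifdef BENCH_STANDALONE
int main(void)
{
    /* unmap_pages == 0 is the baseline with no munmap at all */
    printf("baseline:       %lld ns\n",
           (long long)retouch_after_munmap_ns(512, 0));
    printf("after munmap 1: %lld ns\n",
           (long long)retouch_after_munmap_ns(512, 1));
    return 0;
}
#endif
```

Build with something like "cc -O2 -DBENCH_STANDALONE bench.c" to get the small driver. A single run is noisy, and for small page counts the clock_gettime overhead is comparable to what is being measured, so the numbers are only meaningful averaged over many iterations and compared between kernels with and without the patch.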