Re: [PATCH v5 6/7] x86/tlb: optimizing flush_tlb_mm

Luming Yu <luming.yu@xxxxxxxxx> · Tue, 15 May 2012 20:58:54 +0800

On Tue, May 15, 2012 at 5:17 PM, Nick Piggin <npiggin@xxxxxxxxx> wrote:
> On 15 May 2012 19:15, Nick Piggin <npiggin@xxxxxxxxx> wrote:
>> So this should go to linux-arch...
>>
>> On 15 May 2012 18:55, Alex Shi <alex.shi@xxxxxxxxx> wrote:
>>> Not every flush_tlb_mm execution moment is really need to evacuate all
>>> TLB entries, like in munmap, just few 'invlpg' is better for whole
>>> process performance, since it leaves most of TLB entries for later
>>> accessing.
>
> Did you have microbenchmarks for this like your mprotect numbers,
> by the way? Test munmap numbers and see how that looks. Also,

Might be off topic, but I just spent a few minutes testing the
difference between writing CR3 and invlpg on a pretty old but still
reliable P4 desktop, using the simple hardware latency and bandwidth
test tool I posted as an RFC on LKML several weeks ago.

Both __native_flush_tlb() and __native_flush_tlb_single(...)
introduced roughly 1 ns of latency to the TSC sampling executed in
stop_machine_context on two logical CPUs.
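
The sampling idea behind a comparison like this can be sketched in
user space. This is not the test tool mentioned above, only a minimal
illustration under stated assumptions: clock_gettime() stands in for
reading the TSC, and op is a hypothetical callback for the operation
under test. A CR3 write or invlpg is privileged and cannot be issued
from user space, so the real measurement has to run in the kernel. All
function names here are made up for the example.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Monotonic clock in ns; a user-space stand-in for reading the TSC. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/* Sort v in place and return its median (upper median for even n); n > 0. */
static uint64_t median_u64(uint64_t *v, size_t n)
{
	qsort(v, n, sizeof(*v), cmp_u64);
	return v[n / 2];
}

/*
 * Median gap between two back-to-back clock reads, with op() called in
 * between when op is non-NULL. Returns 0 if n is 0 or allocation fails.
 */
static uint64_t median_gap_ns(void (*op)(void), size_t n)
{
	uint64_t *gaps, med;
	size_t i;

	if (n == 0)
		return 0;
	gaps = malloc(n * sizeof(*gaps));
	if (!gaps)
		return 0;

	for (i = 0; i < n; i++) {
		uint64_t t0 = now_ns();

		if (op)
			op();	/* hypothetical operation under test */
		gaps[i] = now_ns() - t0;
	}

	med = median_u64(gaps, n);
	free(gaps);
	return med;
}

/* Latency attributed to op: gap with op minus baseline gap, clamped at 0. */
static uint64_t added_latency_ns(uint64_t baseline, uint64_t with_op)
{
	return with_op > baseline ? with_op - baseline : 0;
}
```

A caller would take a baseline with median_gap_ns(NULL, n), take a
second reading with the operation passed as op, and feed both to
added_latency_ns(). The median is used instead of the mean so that a
few samples disturbed by interrupts do not dominate the result.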

Just to fuel the discussion. :-)

Cheers,
/l