Re: [PATCH v5 6/7] x86/tlb: optimizing flush_tlb_mm

Luming Yu <luming.yu@xxxxxxxxx> · Tue, 15 May 2012 21:27:10 +0800

On Tue, May 15, 2012 at 9:06 PM, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> On Tue, 2012-05-15 at 20:58 +0800, Luming Yu wrote:
>>
>>
>> Both __native_flush_tlb() and __native_flush_tlb_single(...)
>> introduced roughly 1 ns  latency to tsc sampling executed in

Correction: "1 ns" above is a typo. I only observed about 1 us with the
current tool. I will check whether I can push the accuracy down to the
nanosecond level.

>> stop_machine_context in two logical CPUs
>
> But you have to weigh that against the cost of re-population, and

Right, it's hard to detect, but I will see whether I can get the
measurement done in a simple test tool, so that people can measure
this kind of thing in a few minutes.
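
As a rough illustration of the sampling idea only, and not the tool
mentioned above: a minimal user-space sketch that reads a clock back to
back and records the largest gap between consecutive samples. It assumes
clock_gettime(CLOCK_MONOTONIC) in place of raw TSC reads, and it runs in
an ordinary process rather than in stop_machine context, so it cannot
attribute a gap to a TLB flush. The names now_ns() and
max_sample_gap_ns() are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read the monotonic clock in nanoseconds.
 * Assumption: clock_gettime() stands in for a raw TSC read here. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: sample the clock `samples` times back to back
 * and return the largest gap between two consecutive samples, in ns.
 * A large gap means something delayed the sampling loop. */
static uint64_t max_sample_gap_ns(unsigned long samples)
{
    uint64_t prev = now_ns();
    uint64_t max_gap = 0;
    unsigned long i;

    for (i = 0; i < samples; i++) {
        uint64_t cur = now_ns();

        if (cur - prev > max_gap)
            max_gap = cur - prev;
        prev = cur;
    }
    return max_gap;
}
```

Calling max_sample_gap_ns() with, say, a million samples gives the worst
gap seen in that run. The resolution is limited by the cost of the clock
read itself, which is one reason a figure at the 1 us level is easier to
get than one at the nanosecond level.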

> that's the difficult bit, since we have no clue how many tlb entries are
> in use by the current cr3.
>
> It might be possible for intel to give us this information, I've asked
> for something similar for cachelines.

This is the official optimization manual:
http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf

Let me know if it answers your question.

>