Re: [PATCH v5 6/7] x86/tlb: optimizing flush_tlb_mm

Alex Shi <alex.shi@xxxxxxxxx> · Tue, 15 May 2012 21:28:50 +0800

On 05/15/2012 09:27 PM, Luming Yu wrote:

> On Tue, May 15, 2012 at 9:06 PM, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
>> On Tue, 2012-05-15 at 20:58 +0800, Luming Yu wrote:
>>>
>>>
>>> Both __native_flush_tlb() and __native_flush_tlb_single(...)
>>> introduced roughly 1 ns latency to TSC sampling executed in
> 
> Typo fix: I only observed about 1 us with the current tool. I will
> check whether I can push the accuracy to the nanosecond level.
> 
>>> stop_machine_context on two logical CPUs
>>
>> But you have to weigh that against the cost of re-population, and
> 
> Right, it's hard to detect, but I will try to get the measurement
> done in a simple test tool, so that people can measure this kind of
> thing in a few minutes.
> 
>> that's the difficult bit, since we have no clue how many TLB entries are
>> in use by the current cr3.
>>
>> It might be possible for Intel to give us this information, I've asked
>> for something similar for cachelines.
> 
> This is the official document
> http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf
> 

Please, that is a huge document! And it does not contain this information either.

> Let me know if it can answer your question.
> 
>>
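
On the 1 us versus nanosecond question above, a rough user-space sketch
of how one might check what a sampling tool can resolve is below. It is
not the tool discussed in this thread: sample_ns(), min_sample_gap_ns()
and cycles_to_ns() are hypothetical helpers, clock_gettime() stands in
for a raw TSC read, and the TSC frequency passed to cycles_to_ns() is an
assumed constant, not a measured one.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic timestamp in ns, standing in for a TSC read. */
static uint64_t sample_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: smallest non-zero gap between back-to-back samples.
 * A delay shorter than this gap cannot be resolved by the tool.
 * Returns UINT64_MAX if the clock never advanced during the run.
 */
static uint64_t min_sample_gap_ns(unsigned int iterations)
{
	uint64_t best = UINT64_MAX;
	uint64_t prev = sample_ns();

	assert(iterations > 0);
	while (iterations--) {
		uint64_t cur = sample_ns();

		if (cur > prev && cur - prev < best)
			best = cur - prev;
		prev = cur;
	}
	return best;
}

/*
 * Convert a cycle count to ns for an assumed constant TSC frequency.
 * Only meant for small counts: cycles * 1e9 must fit in 64 bits.
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t tsc_hz)
{
	return cycles * 1000000000ULL / tsc_hz;
}
```

For example, with an assumed 2.5 GHz TSC, cycles_to_ns(2500, 2500000000ULL)
gives 1000 ns, so a 1 us delta corresponds to about 2500 cycles. If
min_sample_gap_ns() reports a gap near 1 us on a given setup, the
sampling path itself, not the flush, may be what limits the reading.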
