On 05/15/2012 09:27 PM, Luming Yu wrote:
> On Tue, May 15, 2012 at 9:06 PM, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
>> On Tue, 2012-05-15 at 20:58 +0800, Luming Yu wrote:
>>>
>>> Both __native_flush_tlb() and __native_flush_tlb_single(...)
>>> introduced roughly 1 ns latency to tsc sampling executed in
>
> Typo fix: I just observed 1 us with the current tool. I will check
> whether I can push the accuracy to the nanosecond level.
>
>>> stop_machine_context in two logical CPUs
>>
>> But you have to weigh that against the cost of re-population, and
>
> Right, it's hard to detect, but I will try to get the measurement
> done in a simple test tool, to help people measure
> this kind of thing in a few minutes.
>
>> that's the difficult bit, since we have no clue how many TLB entries are
>> in use by the current cr3.
>>
>> It might be possible for Intel to give us this information; I've asked
>> for something similar for cachelines.
>
> This is the official document:
> http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf
>

Please, that is such a huge document! And it also has no such info.

> Let me know if it can answer your question.
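
For illustration only, here is a minimal user-space sketch of the kind of
sampling tool mentioned above: it reads a clock back-to-back and reports the
largest gap between neighbouring samples. This is not the tool from this
thread. The names now_ns() and max_sample_gap_ns() are hypothetical, and
clock_gettime(CLOCK_MONOTONIC) is only a stand-in for raw TSC reads taken in
stop_machine context, so it also sees scheduler and interrupt noise that an
in-kernel measurement would not.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read a monotonic clock in nanoseconds (assumed stand-in for a TSC read). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);   /* CLOCK_MONOTONIC is expected to be available */
    (void)rc;          /* silence unused warning when NDEBUG is defined */

    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Take `samples` back-to-back clock readings and return the largest gap,
 * in nanoseconds, between two consecutive readings. Returns 0 when
 * `samples` is 0.
 */
static uint64_t max_sample_gap_ns(unsigned long samples)
{
    uint64_t prev = now_ns();
    uint64_t worst = 0;

    for (unsigned long i = 0; i < samples; i++) {
        uint64_t cur = now_ns();
        uint64_t gap = cur - prev;

        if (gap > worst)
            worst = gap;
        prev = cur;
    }
    return worst;
}
```

Calling max_sample_gap_ns(1000000) from a small main() and printing the
result gives one data point. The useful signal is the difference between
runs with and without the activity that triggers the TLB flushes, not the
absolute number, and this sketch says nothing about the re-population cost
raised above.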