On 12/13/2013 09:43 PM, Ingo Molnar wrote:
>
> * Alex Shi <alex.shi@xxxxxxxxxx> wrote:
>
>> On 12/13/2013 09:02 AM, Alex Shi wrote:
>>>>> You have not replied to this concern of mine: if my concern is valid
>>>>> then that invalidates much of the current tunings.
>>> The benefit from pretend flush range is not unconditional, since invlpg
>>> also cost time. And different CPU has different invlpg/flush_all
>>> execution time.
>>
>> TLB refill time is also different on different kind of cpu.
>>
>> BTW,
>> A bewitching idea is still attracting me.
>> https://lkml.org/lkml/2012/5/23/148
>> Even it was sentenced to death by HPA.
>> https://lkml.org/lkml/2012/5/24/143
>
> I don't think it was sentenced to death by HPA. What do the hardware
> guys say, is this safe on current CPUs?

That discussion was fully public; I have no other information beyond it.
At that time I tested on core2, nhm, wsm, snb and ivb, every kind of
machine I could get, and found no issues.

Also, a rebased patch has been running in Fengguang's testing system
since last Friday, and there is no bad news so far.
Fengguang, it is the x86-tlb branch on my github tree.

>
> If yes then as long as we only activate this optimization for known
> models (and turn it off for unknown models) we should be pretty safe,
> even if the hw guys (obviously) don't want to promise this
> indefinitely for all Intel HT implementations in the future, right?

Agree with you.

>
>> That is that just flush one of thread TLB is enough for SMT/HT,
>> seems TLB is still shared in core on Intel CPU. This benefit is
>> unconditional, and if my memory right, Kbuild testing can improve
>> about 1~2% in average level.
>
> Oh, a 1-2% kbuild speedup is absolutely _massive_. Don't even think
> about dropping this idea ... it needs to be explored.
>
> Alas, that for_each_cpu() loop is obviously disgusting, these values
> should be precalculated into percpu variables and such.

Yes, pre-calculated variables would save much time.
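
To make the pre-calculation idea concrete, here is a minimal userspace
sketch. It is not the patch from the links above and it uses no kernel
API: NR_CPUS, core_id[], sibling_mask[] and both function names are
hypothetical, and a plain array stands in for a percpu variable. It
assumes HT siblings really do share the TLB, which is the open hardware
question in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical machine: 4 cores x 2 HT siblings */

/* Hypothetical topology table: core_id[cpu] is the physical core of a logical CPU. */
static const int core_id[NR_CPUS] = { 0, 1, 2, 3, 0, 1, 2, 3 };

/*
 * Stand-in for a percpu variable: bit n is set if CPU n shares a core
 * with this CPU (the CPU itself included).
 */
static uint64_t sibling_mask[NR_CPUS];

/* Run once when the topology is known, so the flush path does no topology lookups. */
static void precompute_sibling_masks(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		sibling_mask[cpu] = 0;
		for (int other = 0; other < NR_CPUS; other++)
			if (core_id[other] == core_id[cpu])
				sibling_mask[cpu] |= UINT64_C(1) << other;
	}
}

/* Keep one flush target per core: the lowest-numbered CPU of each core present in mask. */
static uint64_t reduce_flush_mask(uint64_t mask)
{
	uint64_t targets = 0;

	assert((mask >> NR_CPUS) == 0);	/* no bits beyond the known CPUs */
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mask & (UINT64_C(1) << cpu)))
			continue;
		targets |= UINT64_C(1) << cpu;
		mask &= ~sibling_mask[cpu];	/* siblings are covered by this flush */
	}
	return targets;
}
```

With this table, after precompute_sibling_masks() a request covering all
eight CPUs (0xff) reduces to 0x0f, and a request for CPUs 0 and 4 (0x11)
reduces to 0x01. On a model where siblings do not share the TLB the
reduction would be wrong, so it would have to stay off for unknown models.
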
>
>> So could you like to accept some ugly quirks to do this lazy TLB
>> flush on known working CPU?
>
> it's not really 'lazy TLB flush' AFAICS but a genuine optimization:
> only flush the TLB on the logical CPUs that need it, right? I.e. do
> only one flush per pair of siblings.
>
>> Forgive me if it's stupid.
>
> I'd say measurable speedups that are safe are never ever stupid.

Thanks a lot!

>
> And even the range-flush TLB optimization we are talking about here
> could still be used IMO, just tone it down a bit and make it less
> model dependent.
>
> Thanks,
>
> Ingo
>

--
Thanks
Alex

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx. For more info on Linux MM,
see: http://www.linux-mm.org/ .