On Thu, 23 Aug 2018 15:21:30 +1000
Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx> wrote:

> On Wed, 2018-08-22 at 22:11 -0700, Linus Torvalds wrote:
> > On Wed, Aug 22, 2018 at 9:54 PM Benjamin Herrenschmidt <benh@xxxxxxxxxxx> wrote:
> > >
> > > So we do need a different flush instruction for the page tables vs. the
> > > normal TLB pages.
> >
> > Right. ARM wants it too. x86 is odd in that a regular "invlpg" already
> > invalidates all the internal tlb cache nodes.
> >
> > So the "new world order" is exactly that patch that PeterZ sent you, that adds a
> >
> > +       unsigned int freed_tables : 1;
> >
>
> .../...
>
> > So instead, when you get to the actual "tlb_flush(tlb)", you do
> > exactly that - flush the tlb. And the mmu_gather structure shows you
> > how much you need to flush. If you see that "freed_tables" is set,
> > then you know that you need to also do the special instruction to
> > flush the inner level caches. The range continues to show the page
> > range.
>
> Yup. That looks like a generic version of the "need_flush_all" flag we
> have, which is fine by us.
>
> Just don't blame powerpc for all the historical crap :-)

And yes we very much want to remove the x86 hacks from generic code and
have them use the sane powerpc/radix page walk cache flushing model.
That would indeed allow us to stop overriding those macros and start
sharing more code with other archs. We can help write or review code to
make sure bugs don't creep in when moving it to a generic
implementation. It will be much more relevant to us now because radix
is very similar to x86.

The hack is not the powerpc override macros though, let's be clear
about that. Even x86 will be helped out by removing that crap because
it won't have to do a full TLB flush caused by a massive TLB range if
it frees 0..small number of pages that happen to also free some page
tables.

Thanks,
Nick
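
For reference, the model being discussed can be sketched roughly as below. This is a user-space toy, not the kernel's mmu_gather: only the "freed_tables" bit comes from the quoted patch, and every other name here (gather_sketch, flush_plan, gather_free_page and so on) is hypothetical and made up for illustration. The point it tries to show is that the page range and the "a page table was freed" fact are tracked separately, so freeing a table does not have to widen the range into a full flush.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical stand-in for mmu_gather; only freed_tables is from the patch. */
struct gather_sketch {
	unsigned long start;            /* lowest address of a freed page */
	unsigned long end;              /* one past the highest freed page */
	unsigned int freed_tables : 1;  /* a page-table page was freed */
};

/* Hypothetical result type: what the arch flush would have to do. */
struct flush_plan {
	bool flush_range;       /* invalidate TLB entries in [start, end) */
	bool flush_walk_cache;  /* also flush the page walk (inner level) cache */
	unsigned long start;
	unsigned long end;
};

static void gather_init(struct gather_sketch *g)
{
	g->start = ~0UL;   /* empty range: start > end */
	g->end = 0;
	g->freed_tables = 0;
}

/* Record one freed page; this only grows the page range. */
static void gather_free_page(struct gather_sketch *g, unsigned long addr)
{
	assert(addr % SKETCH_PAGE_SIZE == 0);
	if (addr < g->start)
		g->start = addr;
	if (addr + SKETCH_PAGE_SIZE > g->end)
		g->end = addr + SKETCH_PAGE_SIZE;
}

/* Record a freed page table; the page range is left alone in this sketch. */
static void gather_free_table(struct gather_sketch *g)
{
	g->freed_tables = 1;
}

/* Decide what to flush, then reset the gather for the next batch. */
static struct flush_plan gather_flush(struct gather_sketch *g)
{
	struct flush_plan plan;

	plan.flush_range = g->end > g->start;
	plan.flush_walk_cache = g->freed_tables;
	plan.start = g->start;
	plan.end = g->end;

	gather_init(g);
	return plan;
}
```

With this shape, a batch that frees one page plus one page table yields a one-page range with flush_walk_cache set, rather than a huge range that would push the arch code into a full TLB flush.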