On Wed, Sep 23 2020 at 12:19, peterz wrote:
> On Mon, Sep 21, 2020 at 09:27:57PM +0200, Thomas Gleixner wrote:
>> Alternatively this could of course be solved with per CPU page tables
>> which will come around some day anyway I fear.
>
> Previously (with PTI) we looked at making the entire kernel map per-CPU,
> and that takes a 2K copy on switch_mm() (or more general, the user part
> of whatever the top level directory is for architectures that have a
> shared kernel/user page-table setup in the first place).
>
> The idea was having a fixed per-cpu kernel page-table, share a bunch of
> (kernel) page-tables between all CPUs and then copy in the user part on
> switch.
>
> I've forgotten what the plan was for ASID/PCID in that scheme.
>
> For x86_64 we've been fearing the performance of that 2k copy, but I
> don't think we've ever actually bit the bullet and implemented it to see
> how bad it really is.

I actually did at some point and depending on the workload the overhead
was clearly measurable. And yes, it fell apart with PCID and I could not
come up with a scheme for it which did not suck horribly. So I buried
the patches in the poison cabinet.

Aside from that, we'd need to implement that for eight other
architectures as well...

Thanks,

        tglx
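
For readers wondering where the 2K figure in the quoted text comes from, here is a rough userspace sketch, not kernel code and not the buried patches. It assumes the usual x86_64 layout of a 512-entry top-level table with 8-byte entries, of which the lower half covers user space. The names pgd_entry_t and copy_user_pgd() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed x86_64 layout: 512 top-level entries of 8 bytes each. */
#define PTRS_PER_PGD   512
/* Assumption: the lower half of the top-level table maps user space. */
#define USER_PGD_PTRS  (PTRS_PER_PGD / 2)

/* Hypothetical stand-in for the kernel's pgd_t. */
typedef uint64_t pgd_entry_t;

/* 256 entries * 8 bytes = 2048 bytes, i.e. the "2K copy". */
static_assert(USER_PGD_PTRS * sizeof(pgd_entry_t) == 2048,
              "user half of the top-level table should be 2K");

/*
 * Hypothetical helper: copy the user half of an mm's top-level table
 * into a fixed per-CPU top-level table, leaving the per-CPU kernel
 * half untouched. Returns the number of bytes copied.
 */
static size_t copy_user_pgd(pgd_entry_t *cpu_pgd, const pgd_entry_t *mm_pgd)
{
	size_t bytes = USER_PGD_PTRS * sizeof(pgd_entry_t);

	memcpy(cpu_pgd, mm_pgd, bytes);
	return bytes;
}
```

In this sketch a context switch would call copy_user_pgd() once per switch_mm(), which is the cost being discussed. It says nothing about how PCID/ASID tagging would interact with a shared per-CPU table, which is the part that fell apart.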