On Fri, Nov 20, 2020 at 02:33:58AM +0100, Thomas Gleixner wrote:
> On Thu, Nov 19 2020 at 19:28, Peter Zijlstra wrote:
> > On Thu, Nov 19, 2020 at 09:23:47AM -0800, Linus Torvalds wrote:
> >> Because this is certainly not the only time migration limiting has
> >> come up, and no, it has absolutely nothing to do with per-cpu page
> >> tables being completely unacceptable.
> >
> > It is for this instance; but sure, it's come up before in other
> > contexts.
>
> Indeed. And one of the really bad outcomes of this is that people are
> forced to use preempt_disable() to prevent migration which entails a
> slew of consequences:
>
>  - Using spinlocks where it wouldn't be needed otherwise
>  - Spinwaiting instead of sleeping
>  - The whole craziness of doing copy_to/from_user_in_atomic() along
>    with the necessary out of line error handling.
>  - ....
>
> The introduction of per-cpu storage happened almost 20 years ago (2002)
> and still the only answer we have is preempt_disable().

IIRC the first time this migrate_disable() stuff came up was when Chris
Lameter did SLUB. Eventually he settled for that cmpxchg_double()
approach (which is somewhat similar to userspace rseq) which is vastly
superior and wouldn't have happened had we provided migrate_disable().

As already stated, per-cpu page-tables would allow for a much saner kmap
approach, but alas, x86 really can't sanely do that (the archs that have
separate kernel and user page-tables could do this, and how we cursed
that x86 didn't have that when meltdown happened).

[ and using fixmaps in the per-cpu memory space _could_ work, but is a
  giant pain because then all accesses need GS prefix and blah... ]

And I'm sure there's creative ways for other problems too, but yes, it's
hard.

Anyway, clearly I'm the only one that cares, so I'll just crawl back
under my rock...