On Fri, Mar 22, 2024 at 05:36:00PM -0700, Linus Torvalds wrote:
> On Fri, 22 Mar 2024 at 17:21, Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
> >
> > Besides that there's cross arch support to think about - it's hard to
> > imagine us ever ditching our own atomics.
>
> Well, that's one of the advantages of using compiler builtins -
> projects that do want cross-architecture support, but that aren't
> actually maintaining their _own_ architecture support.
>
> So I very much see the lure of compiler support for that kind of
> situation - to write portable code without having to know or care
> about architecture details.
>
> This is one reason I think the kernel is kind of odd and special -
> because in the kernel, we obviously very fundamentally have to care
> about the architecture details _anyway_, so then having the
> architecture also define things like atomics is just a pretty small
> (and relatively straightforward) detail.
>
> The same argument goes for compiler builtins vs inline asm. In the
> kernel, we have to have people who are intimately familiar with the
> architecture _anyway_, so inline asms and architecture-specific header
> files aren't some big pain-point: they'd be needed _anyway_.
>
> But in some random user level program, where all you want is an
> efficient way to do "find first bit"? Then using a compiler intrinsic
> makes a lot more sense.

We've got a whole spectrum of kernel code though, and a lot of it is
code that - honestly - would be better off not being specific to the
kernel. rhashtable comes to mind; it's fully generic and excellent at
what it does, but it's had a number of annoyingly subtle bugs and sharp
edges over the years that are really just a result of it not having
enough users.

So I see some real value in regularizing things.

> > I was thinking about something more incremental - just an optional mode
> > where our atomics were C atomics underneath. It'd probably give the
> > compiler people a much more effective way to test their stuff than
> > anything they have now.
>
> I suspect it might be painful, and some compiler people would throw
> their hands up in horror, because the C++ atomics model is based
> fairly solidly on atomic types, and the kernel memory model is much
> more fluid.
>
> Boqun already mentioned the "mixing access sizes", which is actually
> quite fundamental in the kernel, where we play lots of games with that
> (typically around locking, where you find patterns like unlock writing
> a zero to a single byte, even though the whole lock data structure is
> a word). And sometimes the access size games are very explicit (eg
> lib/lockref.c).

I don't think mixing access sizes should be a real barrier. On the read
side we can obviously do that with a helper; the write side needs
compiler help, but "writing just a byte out of a word" is no different
from a compiler POV than "writing a single bit", and we can already mix
atomic_or() with atomic_add(), with both C atomics and LKMM atomics.

> But it actually goes deeper than that. While we do have "atomic_t" etc
> for arithmetic atomics, and that probably would map fairly well to C++
> atomics, in other cases we simply base our atomics not on _types_, but
> on code.
>
> IOW, we do things like "cmpxchg()", and the target of that atomic
> access is just a regular data structure field.

Well, some of that's historical cruft; cmpxchg() and atomic_cmpxchg()
have different orderings, and we can specify that more directly now.
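Something like this is what I mean by "more directly" - just a sketch,
using the _acquire/_relaxed cmpxchg() variants we already have, with a
made-up struct and made-up helpers for illustration:

#include <linux/atomic.h>
#include <linux/types.h>

/*
 * Sketch only: 'struct foo' and these helpers don't exist anywhere.
 * The point is that the ordering is named at the call site instead of
 * being implied by which flavour of cmpxchg() you happened to reach for.
 */
struct foo {
	unsigned long	state;
};

/* fully ordered, same as plain cmpxchg() today */
static inline bool foo_claim(struct foo *f, unsigned long old, unsigned long new)
{
	return cmpxchg(&f->state, old, new) == old;
}

/* acquire ordering only, spelled out explicitly */
static inline bool foo_claim_acquire(struct foo *f, unsigned long old, unsigned long new)
{
	return cmpxchg_acquire(&f->state, old, new) == old;
}

/* no ordering at all, e.g. for stats or best-effort updates */
static inline bool foo_claim_relaxed(struct foo *f, unsigned long old, unsigned long new)
{
	return cmpxchg_relaxed(&f->state, old, new) == old;
}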
But we definitely need the ability to cmpxchg() any struct of a size
the machine supports atomic access to. Rust should be able to manage
that more easily than C/C++ though - they've got a type system that
can sanely represent that.
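To make it concrete, this is roughly the shape of the C workaround
today: pun the struct through a union so there's a machine word to
hand to cmpxchg(). The struct and helper below are made up for
illustration - it's the general shape of what lib/lockref.c does, not
the real code:

#include <linux/atomic.h>
#include <linux/compiler.h>
#include <linux/types.h>

/* Two fields packed into one 64-bit word so we can cmpxchg the pair. */
struct pair {
	u32	lock;
	u32	count;
};

union pair_word {
	struct pair	p;
	u64		word;
};

/* Bump count, but only if lock isn't held - one atomic op, no spinlock. */
static bool pair_inc_if_unlocked(union pair_word *v)
{
	union pair_word old, new;

	old.word = READ_ONCE(v->word);
	do {
		if (old.p.lock)		/* someone holds the lock, give up */
			return false;
		new = old;
		new.p.count++;
	} while (!try_cmpxchg64(&v->word, &old.word, new.word));

	return true;
}

In Rust you could presumably say "any type that fits in a word the
machine can access atomically" in the type system itself, instead of
punning through a union like this.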