On Mon, Mar 25, 2024 at 10:44:45AM +0000, Mark Rutland wrote:
[...]
> >
> > * I choose to re-implement atomics in Rust `asm` because we are still
> >   figuring out how we can make it easy and maintainable for Rust to call
> >   a C function _inlinely_ (Gary makes some progress [2]). Otherwise,
> >   atomic primitives would be function calls, and that can be performance
> >   bottleneck in a few cases.
>
> I don't think we want to maintain two copies of each architecture's atomics.
> This gets painful very quickly (e.g. as arm64's atomics get patched between
> LL/SC and LSE forms).
>

No argument here ;-)

> Can we start off with out-of-line atomics, and see where the bottlenecks are?
>
> It's relatively easy to do that today, at least for the atomic*_*() APIs:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=atomics/outlined&id=e0a77bfa63e7416d610769aa4ab62bc06993ce56
>
> ... which IIUC covers the "AtomicI32, AtomicI64 and AtomicUsize" cases you
> mention above.
>

Thanks! Yes, I know I should have checked with you before finalizing the
implementation ;-) I will try to integrate that, but a few things to note:

* For module usage, we need to EXPORT_SYMBOL_GPL() all the atomics. I'm
  OK with that, but I don't know how others feel about it.

* Alice reported a performance gap between inline and out-of-line
  refcount operations in the Rust binder driver:

    https://github.com/Darksonn/linux/commit/b4be1bd6c44225bf7276a4666fd30b8da9cba517

  I don't know how much worse it is, since I don't have the data, but
  that's one of the reasons I started with inline asm.

That being said, I totally agree that we could start with out-of-line
atomics, and maybe provide inline versions for performance-critical
paths. The hope is that we can eventually figure out how Rust could
inline a C function.

Regards,
Boqun

> Mark.
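
For illustration only, here is a rough userspace sketch of the "out-of-line"
shape discussed above: a Rust wrapper type whose operations are plain
function calls into a single non-inlined helper, rather than per-architecture
inline asm. Everything here is an assumption for the sake of the example.
The helper name `outlined_atomic_add_return` and the wrapper `AtomicI32` are
hypothetical, the helper is written in Rust with `#[inline(never)]` as a
stand-in for an exported C function, and `SeqCst` only approximates the
kernel's fully ordered semantics. It is not taken from the linked patches.

```rust
use std::sync::atomic::{AtomicI32 as StdAtomicI32, Ordering};

// Stand-in for an out-of-line C helper. In the kernel this would instead be
// an `extern "C"` declaration resolved against an exported C symbol; the
// name `outlined_atomic_add_return` is hypothetical.
#[inline(never)]
fn outlined_atomic_add_return(i: i32, v: &StdAtomicI32) -> i32 {
    // `fetch_add` returns the old value, so add `i` again to return the new
    // value, mirroring the "add, then return the result" convention.
    // SeqCst is an assumption used to approximate a fully ordered operation.
    v.fetch_add(i, Ordering::SeqCst).wrapping_add(i)
}

/// Hypothetical Rust-side wrapper: every operation is a real function call
/// into the single out-of-line implementation above.
pub struct AtomicI32 {
    inner: StdAtomicI32,
}

impl AtomicI32 {
    /// Creates a new atomic with the given initial value.
    pub const fn new(v: i32) -> Self {
        Self { inner: StdAtomicI32::new(v) }
    }

    /// Atomically adds `i` and returns the new value (via the helper call).
    pub fn add_return(&self, i: i32) -> i32 {
        outlined_atomic_add_return(i, &self.inner)
    }

    /// Reads the current value with relaxed ordering.
    pub fn read(&self) -> i32 {
        self.inner.load(Ordering::Relaxed)
    }
}
```

With this shape there is only one implementation of each primitive to
maintain, at the cost of a function call per operation, which is the
trade-off the refcount numbers above are about.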