Hi, Arnd,

On Mon, Mar 21, 2022 at 5:42 PM Arnd Bergmann <arnd@xxxxxxxx> wrote:
>
> On Sat, Mar 19, 2022 at 3:31 PM Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
> > LoongArch has no native sub-word xchg/cmpxchg instructions now, but
> > LoongArch-based CPUs support NUMA (e.g., quad-core Loongson-3A5000
> > supports as many as 16 nodes, 64 cores in total). So, we emulate sub-
> > word xchg/cmpxchg in software and use qspinlock/qrwlock rather than
> > ticket locks.
> ...
> > +extern unsigned long __xchg_small(volatile void *ptr, unsigned long x,
> > +                                  unsigned int size);
> > +
> > +static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
> > +                                   int size)
> > +{
> > +       switch (size) {
> > +       case 1:
> > +       case 2:
> > +               return __xchg_small(ptr, x, size);
> > +
>
> I think it's better to not define the "small" versions at all, since they are
> inefficient and probably not safe to use for the few things that try to call
> them, such as the qspinlock implementation.
>
> I have an experimental patch set that removes these from the kernel
> altogether and makes xchg()/cmpxchg() only work on 32-bit or
> 64-bit values.
>
> > diff --git a/arch/loongarch/include/asm/spinlock.h b/arch/loongarch/include/asm/spinlock.h
> > new file mode 100644
> > index 000000000000..7cb3476999be
> > --- /dev/null
> > +++ b/arch/loongarch/include/asm/spinlock.h
> > +
> > +#include <asm/processor.h>
> > +#include <asm/qspinlock.h>
> > +#include <asm/qrwlock.h>
> > +
>
> There is a patch series from Peter Zijlstra, Palmer Dabbelt and Guo Ren
> that is currently under review for risc-v and csky, to add a generic ticket lock
> implementation that does not rely on sub-word atomics [1]. I think we
> also want to convert mips, xtensa, openrisc, and sparc64 to use that,
> since they have the same issue with the lack of 16-bit atomics.
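
To make the point above concrete: a ticket lock can be built from full-word
atomics only. The following is a minimal userspace sketch using C11 atomics,
not the code from the series in [1]. The struct ticket_lock type and the
ticket_lock_*() names are made up for illustration, and the sketch assumes
two separate 32-bit counters for simplicity.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical type for illustration; not the kernel's arch_spinlock_t. */
struct ticket_lock {
	_Atomic uint32_t next;   /* next ticket to hand out */
	_Atomic uint32_t owner;  /* ticket currently allowed to hold the lock */
};

static inline void ticket_lock_init(struct ticket_lock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

static inline void ticket_lock_acquire(struct ticket_lock *l)
{
	/* Take a ticket with a single 32-bit fetch-and-add. */
	uint32_t ticket = atomic_fetch_add_explicit(&l->next, 1,
						    memory_order_relaxed);

	/* Spin until our ticket is served; waiters are served in FIFO order. */
	while (atomic_load_explicit(&l->owner, memory_order_acquire) != ticket)
		;
}

static inline void ticket_lock_release(struct ticket_lock *l)
{
	/* Only the lock holder writes owner, so a load plus a store suffices. */
	uint32_t owner = atomic_load_explicit(&l->owner, memory_order_relaxed);

	atomic_store_explicit(&l->owner, owner + 1, memory_order_release);
}
```

Every operation here is a 32-bit load, store, or fetch-and-add, so nothing
depends on 8-bit or 16-bit xchg/cmpxchg.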
>
> Please coordinate the inclusion of the patches with them and use that
> spinlock implementation for the initial merge, to avoid further discussion
> on the topic. If at a later point you are able to come up with a qspinlock
> implementation that has convincing forward-progress guarantees and
> can be shown to be better, we can revisit this.

In my opinion, forward progress is solved in V2, since we have
reworked __xchg_small()/__cmpxchg_small(), and qspinlock is needed for
NUMA. However, if the generic ticket lock gets merged, I will try to
use it for now.

Huacai

>
> Arnd
>
> [1] https://lore.kernel.org/lkml/20220319035457.2214979-1-guoren@xxxxxxxxxx/
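
For readers following the thread, the general technique behind a software
sub-word xchg can be sketched as a compare-and-swap loop on the containing
32-bit word. This is a userspace sketch with C11 atomics, not the
__xchg_small() from the patch. The name xchg16_in_word() is hypothetical,
and the sketch assumes the caller passes the containing word plus a halfword
index by bit position (0 = low 16 bits, 1 = high 16 bits), which sidesteps
the address and endianness arithmetic a real implementation needs.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: exchange one 16-bit half of a 32-bit word using only
 * a 32-bit compare-and-swap. Returns the previous value of that half.
 */
static inline uint16_t xchg16_in_word(_Atomic uint32_t *word,
				      unsigned int half, uint16_t newval)
{
	unsigned int shift = half * 16;
	uint32_t mask = (uint32_t)0xffff << shift;
	uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
	uint32_t new32;

	do {
		/* Keep the other half as last observed, replace only ours. */
		new32 = (old & ~mask) | ((uint32_t)newval << shift);
		/* On failure, old is refreshed with the current word value. */
	} while (!atomic_compare_exchange_weak_explicit(word, &old, new32,
							memory_order_acq_rel,
							memory_order_relaxed));

	return (uint16_t)((old & mask) >> shift);
}
```

Note that the CAS covers the whole word, so the loop also retries whenever
the neighbouring halfword changes in between. That is the kind of
interference the forward-progress question is about.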