Hi, Arnd,

On Thu, Jun 23, 2022 at 4:26 PM Arnd Bergmann <arnd@xxxxxxxx> wrote:
>
> On Thu, Jun 23, 2022 at 9:56 AM Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
> > On Thu, Jun 23, 2022 at 1:45 PM Guo Ren <guoren@xxxxxxxxxx> wrote:
> > >
> > > On Thu, Jun 23, 2022 at 12:46 PM Huacai Chen <chenhuacai@xxxxxxxxxxx> wrote:
> > > >
> > > > On NUMA systems, the performance of qspinlock is better than the generic
> > > > spinlock. Below are the UnixBench test results on an 8-node (4 cores
> > > > per node, 32 cores in total) machine.
>
> You are still missing an explanation here about why this is safe to do.
> Is there an architectural guarantee for forward progress, or do you rely
> on specific microarchitectural behavior?
In my understanding, "guarantee of forward progress" means avoiding the
situation where many ll/sc sequences run at the same time and none of them
ever succeeds. LoongArch uses "exclusive access (with timeout) of ll" to
avoid simultaneous ll (it also blocks other memory loads/stores to the same
address), and uses "random delay of sc" to avoid simultaneous sc (described
in CPUCFG3, bit 3 and bit 4 [1]). Together these mechanisms guarantee
forward progress in practice.

[1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#_cpucfg

Huacai
>
> > > Could you base the patch on [1]?
> > >
> > > [1] https://lore.kernel.org/linux-riscv/20220621144920.2945595-2-guoren@xxxxxxxxxx/raw
> > I found that whether we use qspinlock or tspinlock, we always use
> > qrwlock, so maybe it is better like this?
> >
> > #ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> > #include <asm/qspinlock.h>
> > #else
> > #include <asm-generic/tspinlock.h>
> > #endif
> >
> > #include <asm/qrwlock.h>
>
> Yes, that seems better, but I would go one step further and include
> asm-generic/qspinlock.h in place of asm/qspinlock.h here: The two
> architectures that have a custom asm/qspinlock.h also have a custom
> asm/spinlock.h, so they have no need to include asm-generic/spinlock.h
> either.
>
>        Arnd
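
For reference, a minimal sketch of what arch/loongarch/include/asm/spinlock.h
could look like with Arnd's suggestion applied (this assumes the
asm-generic/tspinlock.h fallback from Guo Ren's series linked above; it is an
illustration, not a final patch):

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_SPINLOCK_H
#define _ASM_SPINLOCK_H

/*
 * Use the generic queued spinlock directly; only architectures that
 * override qspinlock internals need their own asm/qspinlock.h.
 */
#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
#include <asm-generic/qspinlock.h>
#else
#include <asm-generic/tspinlock.h>
#endif

/* Both configurations share the queued rwlock implementation. */
#include <asm/qrwlock.h>

#endif /* _ASM_SPINLOCK_H */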
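
Going back to the forward-progress question above, here is a rough sketch of
the kind of ll/sc retry loop the qspinlock atomics boil down to on LoongArch,
loosely modeled on the existing LoongArch cmpxchg code (the constraints,
helper name, and exact asm are illustrative, not the kernel's real macros).
The loop itself has no retry bound; it is the exclusive-access window of ll.w
plus the randomized sc.w delay described above that keeps competing CPUs from
failing indefinitely:

/*
 * Illustrative sketch only, not the kernel's actual xchg implementation.
 * The sc.w may fail and branch back to 1:, so without the ll.w exclusive
 * window and the random sc delay (CPUCFG3 bits 3 and 4) this loop could,
 * in principle, livelock under heavy contention.
 */
static inline unsigned int xchg32_sketch(volatile unsigned int *ptr,
					 unsigned int val)
{
	unsigned int old;

	__asm__ __volatile__(
	"1:	ll.w	%0, %2		\n"	/* load-linked: begin exclusive access   */
	"	move	$t0, %3		\n"	/* value we want to store                */
	"	sc.w	$t0, %2		\n"	/* store-conditional: $t0 = 1 on success */
	"	beqz	$t0, 1b		\n"	/* retry until the sc.w succeeds         */
	: "=&r" (old), "=ZB" (*ptr)
	: "ZB" (*ptr), "r" (val)
	: "t0", "memory");

	return old;
}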