Re: [PATCH] LoongArch: Add qspinlock support

On Sat, Jun 18, 2022 at 12:11 AM Arnd Bergmann <arnd@xxxxxxxx> wrote:
>
> On Fri, Jun 17, 2022 at 4:57 PM Huacai Chen <chenhuacai@xxxxxxxxxxx> wrote:
> >
> > On NUMA systems, the performance of qspinlock is better than that of
> > the generic spinlock. Below are the UnixBench test results on an
> > 8-node machine (4 cores per node, 32 cores in total).
> >
>
> The performance increase is nice, but this is only half the story we need here.
>
> I think the more important bit is how you can guarantee that the xchg16()
> implementation is correct and always allows forward progress.
>
> >@@ -123,6 +123,10 @@ static inline unsigned long __percpu_xchg(void *ptr, unsigned long val,
> >                                                int size)
> > {
> >        switch (size) {
> >+       case 1:
> >+       case 2:
> >+               return __xchg_small((volatile void *)ptr, val, size);
> >+
>
> Do you actually need the size 1 as well?
>
> Generally speaking, I would like to rework the xchg()/cmpxchg() logic
> to only cover the 32-bit and word-sized (possibly 64-bit) case, while
> having separate optional 8-bit and 16-bit functions. I had a patch for
> this in the past, and can try to dig that out, this may be the time to
> finally do that.
Why not forbid 8-bit and 16-bit xchg()/cmpxchg() outright? For example,
move the qspinlock xchg_tail() into a per-architecture arch_xchg_tail(),
as sketched below. That way Linux would no longer provide a mixed-size
atomic operation primitive at all.

What do you mean by "separate optional 8-bit and 16-bit functions" here?
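
Here is a minimal sketch of what I mean. It is modeled on the existing
_Q_PENDING_BITS == 1 fallback in kernel/locking/qspinlock.c, and
arch_xchg_tail() is just a name I am making up for the per-arch hook:

/*
 * Hypothetical per-arch hook: exchange the 16-bit tail field using
 * only a word-sized cmpxchg, so no 8/16-bit xchg primitive is needed.
 * Modeled on the _Q_PENDING_BITS == 1 fallback in
 * kernel/locking/qspinlock.c.
 */
static __always_inline u32 arch_xchg_tail(struct qspinlock *lock, u32 tail)
{
        u32 old, new, val = atomic_read(&lock->val);

        for (;;) {
                /* Keep the locked+pending bits, replace the tail. */
                new = (val & _Q_LOCKED_PENDING_MASK) | tail;
                old = atomic_cmpxchg_relaxed(&lock->val, val, new);
                if (old == val)
                        break;
                val = old;
        }
        /* Return the old value; the caller extracts the previous tail. */
        return old;
}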
>
> I see that the qspinlock code actually calls a 'relaxed' version of xchg16(),
> but you only implement the one with the full barrier. Is it possible to
> directly provide a relaxed version that has something less than the
> __WEAK_LLSC_MB?
I am also curious about __WEAK_LLSC_MB; it looks like magic. How does
it prevent accesses that precede the cmpxchg from being reordered after
the sc, as a strongly ordered cmpxchg requires?
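
If I read the LoongArch submission correctly, __WEAK_LLSC_MB is a bare
dbar (please correct me if the patch defines it differently):

#define __WEAK_LLSC_MB          "       dbar 0  \n"     /* as I read the patch */

and it sits after the sc, not before the ll: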

#define __cmpxchg_asm(ld, st, m, old, new)                              \
({                                                                      \
        __typeof(old) __ret;                                            \
                                                                        \
        __asm__ __volatile__(                                           \
        "1:     " ld "  %0, %2          # __cmpxchg_asm \n"             \
        "       bne     %0, %z3, 2f                     \n"             \
        "       or      $t0, %z4, $zero                 \n"             \
        "       " st "  $t0, %1                         \n"             \
        "       beq     $zero, $t0, 1b                  \n"             \
        "2:                                             \n"             \
        __WEAK_LLSC_MB                                                  \
        : "=&r" (__ret), "=ZB"(*m)                                      \
        : "ZB"(*m), "Jr" (old), "Jr" (new)                              \
        : "t0", "memory");                                              \
                                                                        \
        __ret;                                                          \
})

And its __smp_mb__{before,after}_atomic() hooks are defined as a mere
compiler barrier()?
#define __smp_mb__before_atomic()       barrier()
#define __smp_mb__after_atomic()        barrier()
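
For comparison, the generic fallback in include/asm-generic/barrier.h
is a full barrier unless the architecture overrides it:

/* include/asm-generic/barrier.h: default to a real memory barrier,
 * for architectures whose RMW atomics are not fully ordered. */
#ifndef __smp_mb__before_atomic
#define __smp_mb__before_atomic()       __smp_mb()
#endif

#ifndef __smp_mb__after_atomic
#define __smp_mb__after_atomic()        __smp_mb()
#endif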

>
>        Arnd



-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/


