Re: [PATCH RFC 1/2] arch: Introduce ARCH_HAS_HW_XCHG_SMALL

On Tue, Jul 27, 2021 at 1:03 AM Boqun Feng <boqun.feng@xxxxxxxxx> wrote:
>
> On Tue, Jul 27, 2021 at 12:41:34AM +0800, Guo Ren wrote:
> > On Mon, Jul 26, 2021 at 6:39 PM Boqun Feng <boqun.feng@xxxxxxxxx> wrote:
> > >
> > > On Mon, Jul 26, 2021 at 04:56:49PM +0800, Huacai Chen wrote:
> > > > Hi, Geert,
> > > >
> > > > On Mon, Jul 26, 2021 at 4:36 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > Hi Huacai,
> > > > >
> > > > > On Sat, Jul 24, 2021 at 2:36 PM Huacai Chen <chenhuacai@xxxxxxxxxxx> wrote:
> > > > > > Introduce a new Kconfig option ARCH_HAS_HW_XCHG_SMALL, which means arch
> > > > > > has hardware sub-word xchg/cmpxchg support. This option will be used as
> > > > > > an indicator to select the bit-field definition in the qspinlock data
> > > > > > structure.
> > > > > >
> > > > > > Signed-off-by: Huacai Chen <chenhuacai@xxxxxxxxxxx>
> > > > >
> > > > > Thanks for your patch!
> > > > >
> > > > > > --- a/arch/Kconfig
> > > > > > +++ b/arch/Kconfig
> > > > > > @@ -228,6 +228,10 @@ config ARCH_HAS_FORTIFY_SOURCE
> > > > > >           An architecture should select this when it can successfully
> > > > > >           build and run with CONFIG_FORTIFY_SOURCE.
> > > > > >
> > > > > > +# Select if arch has hardware sub-word xchg/cmpxchg support
> > > > > > +config ARCH_HAS_HW_XCHG_SMALL
> > > > >
> > > > > What do you mean by "hardware"?
> > > > > Does a software fallback count?
> > > > This new option is meant as an indicator for selecting the bit-field
> > > > definition of qspinlock; a software fallback is not helpful in this
> > > > case.
> > > >
> > >
> > > I don't think this is true. IIUC, the rationale of the config is that
> > > for some architectures, since the architectural cmpxchg doesn't provide
> > > forward-progress guarantee then using cmpxchg of machine-word to
> > > implement xchg{8,16}() may cause livelock, therefore these architectures
> > > don't want to provide xchg{8,16}(), as a result they cannot work with
> > > qspinlock when _Q_PENDING_BITS is 8.
> > >
> > > So as long as an architecture can provide and has already provided an
> > > implementation of xchg{8,16}() which guarantees forward progress (even
> > > though the implementation is using a machine-word size cmpxchg), the
> > > architecture doesn't need to select ARCH_HAS_HW_XCHG_SMALL.
> > It seems only atomic instructions can provide forward progress, right? And
> > lr/sc can't implement the xchg/cmpxchg primitives properly.
> >
>
> I'm missing your point here: a) ll/sc can provide forward progress and b)
> ll/sc instructions are used to implement xchg/cmpxchg (see ARM64 and
> PPC).
I don't think arm64 can provide a forward-progress guarantee with ll/sc;
otherwise, they wouldn't have added ARM64_HAS_LSE_ATOMICS for large systems.

>
> > How can a CPU make "load + cmpxchg" guarantee forward progress? Fuse
> > these instructions and lock the snoop channel?
> > Maybe hardware guys would think that it's easier to implement cas +
> > dcas + amo(short & byte).
> >
>
> Please note that if _Q_PENDING_BITS == 1, then the xchg_tail() is
> implemented as a "load + cmpxchg", so if "load + cmpxchg" implementation
> of xchg16() doesn't provide forward-progress in an architecture, neither
> does xchg_tail().
That's the problem with "_Q_PENDING_BITS == 1": no hardware can
provide a forward-progress guarantee for "load + ALU + cas"!
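
To make the "load + cmpxchg" shape concrete, here is a minimal userspace
sketch in C11 atomics. It is not the kernel's xchg_tail() or any arch's
xchg16(); the function name xchg16_hi_emulated() and the choice of the upper
half-word are assumptions made only for illustration. It exchanges 16 bits
inside a 32-bit word by looping on a word-sized compare-and-swap, so a thread
that keeps losing the race has no bound on its retries:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical userspace model, not kernel code: swap the upper 16 bits
 * of a 32-bit word using only a word-sized compare-and-swap.
 * Returns the previous upper 16 bits.
 */
static inline uint16_t xchg16_hi_emulated(_Atomic uint32_t *word,
                                          uint16_t newval)
{
    /* The "load" step. */
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t desired;

    do {
        /* The "ALU" step: keep the low half, replace the high half. */
        desired = (old & 0x0000ffffu) | ((uint32_t)newval << 16);
        /*
         * The "cas" step: on failure 'old' is refreshed with the
         * current value and the loop retries. Nothing here bounds
         * how often another CPU can win in between.
         */
    } while (!atomic_compare_exchange_weak_explicit(word, &old, desired,
                                                    memory_order_acq_rel,
                                                    memory_order_relaxed));

    return (uint16_t)(old >> 16);
}
```

Whether such a loop makes forward progress depends on what the hardware
guarantees for the underlying CAS or lr/sc sequence; the C code itself
promises nothing.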

A simple example, atomic a++:
c = READ_ONCE(g_value);
new = c + 1;
while ((old = cmpxchg(&g_value, c, new)) != c) {
    c = old;
    new = c + 1;
}

Q: When this runs on CPU0 (500 MHz) and CPU1 (2 GHz) in one SMP system, how
do we prevent CPU1 from starving CPU0?
A: I think the answer is to use an AMO-add instruction:
atomic_add(1, &g_value);
(If the arch has no atomic instructions and implements atomics with cmpxchg
or lr/sc, it has the same problem.)
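
For comparison, the two variants can be written side by side as a userspace
sketch with C11 atomics. The names inc_cas_loop() and inc_fetch_add() are
made up for this example, and whether atomic_fetch_add() really becomes a
single AMO instruction depends on the target and compiler; on a target
without one it is typically lowered to an lr/sc or CAS loop again:

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * CAS-loop increment (hypothetical helper). Returns how many attempts
 * failed before the increment landed; under contention this count has
 * no upper bound.
 */
static inline unsigned long inc_cas_loop(atomic_int *v)
{
    unsigned long retries = 0;
    int c = atomic_load_explicit(v, memory_order_relaxed);

    /* On failure 'c' is reloaded, so 'c + 1' is recomputed each pass. */
    while (!atomic_compare_exchange_weak(v, &c, c + 1))
        retries++;

    return retries;
}

/*
 * Single read-modify-write request (hypothetical helper). Returns the
 * value before the increment. There is no software retry loop here;
 * any retrying is left to the hardware or to the compiler's lowering.
 */
static inline int inc_fetch_add(atomic_int *v)
{
    return atomic_fetch_add(v, 1);
}
```

The retry count returned by inc_cas_loop() is one way to observe how
unevenly the loop treats a slow CPU competing with a fast one.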

>
> Regards,
> Boqun
>
> > >
> > > Regards,
> > > Boqun
> > >
> > > > >
> > > > > > --- a/arch/m68k/Kconfig
> > > > > > +++ b/arch/m68k/Kconfig
> > > > > > @@ -5,6 +5,7 @@ config M68K
> > > > > >         select ARCH_32BIT_OFF_T
> > > > > >         select ARCH_HAS_BINFMT_FLAT
> > > > > >         select ARCH_HAS_DMA_PREP_COHERENT if HAS_DMA && MMU && !COLDFIRE
> > > > > > +       select ARCH_HAS_HW_XCHG_SMALL
> > > > >
> > > > > M68k CPUs which support the CAS (Compare And Set) instruction do
> > > > > support this on 8-bit, 16-bit, and 32-bit quantities.
> > > > > M68k CPUs which lack CAS use a software implementation, which
> > > > > supports the same quantities.
> > > > >
> > > > > As CAS is used only if CONFIG_RMW_INSNS=y, perhaps this needs
> > > > > a dependency?
> > > > OK, I think this dependency is needed.
> > > >
> > > > Huacai
> > > >
> > > > >
> > > > >    select ARCH_HAS_HW_XCHG_SMALL if RMW_INSNS
> > > > >
> > > > > Gr{oetje,eeting}s,
> > > > >
> > > > >                         Geert
> > > > >
> > > > > --
> > > > > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
> > > > >
> > > > > In personal conversations with technical people, I call myself a hacker. But
> > > > > when I'm talking to journalists I just say "programmer" or something like that.
> > > > >                                 -- Linus Torvalds
> >
> >
> >
> > --
> > Best Regards
> >  Guo Ren
> >
> > ML: https://lore.kernel.org/linux-csky/



-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/



