Re: [PATCH RFC 1/2] arch: Introduce ARCH_HAS_HW_XCHG_SMALL

Guo Ren <guoren@xxxxxxxxxx> · Tue, 27 Jul 2021 09:27:51 +0800



On Tue, Jul 27, 2021 at 5:21 AM Waiman Long <llong@xxxxxxxxxx> wrote:
>
> On 7/26/21 1:03 PM, Boqun Feng wrote:
> > On Tue, Jul 27, 2021 at 12:41:34AM +0800, Guo Ren wrote:
> >> On Mon, Jul 26, 2021 at 6:39 PM Boqun Feng <boqun.feng@xxxxxxxxx> wrote:
> >>> On Mon, Jul 26, 2021 at 04:56:49PM +0800, Huacai Chen wrote:
> >>>> Hi, Geert,
> >>>>
> >>>> On Mon, Jul 26, 2021 at 4:36 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> >>>>> Hi Huacai,
> >>>>>
> >>>>> On Sat, Jul 24, 2021 at 2:36 PM Huacai Chen <chenhuacai@xxxxxxxxxxx> wrote:
> >>>>>> Introduce a new Kconfig option ARCH_HAS_HW_XCHG_SMALL, which means arch
> >>>>>> has hardware sub-word xchg/cmpxchg support. This option will be used as
> >>>>>> an indicator to select the bit-field definition in the qspinlock data
> >>>>>> structure.
> >>>>>>
> >>>>>> Signed-off-by: Huacai Chen <chenhuacai@xxxxxxxxxxx>
> >>>>> Thanks for your patch!
> >>>>>
> >>>>>> --- a/arch/Kconfig
> >>>>>> +++ b/arch/Kconfig
> >>>>>> @@ -228,6 +228,10 @@ config ARCH_HAS_FORTIFY_SOURCE
> >>>>>>            An architecture should select this when it can successfully
> >>>>>>            build and run with CONFIG_FORTIFY_SOURCE.
> >>>>>>
> >>>>>> +# Select if arch has hardware sub-word xchg/cmpxchg support
> >>>>>> +config ARCH_HAS_HW_XCHG_SMALL
> >>>>> What do you mean by "hardware"?
> >>>>> Does a software fallback count?
> >>>> This new option is supposed as an indicator to select bit-field
> >>>> definition of qspinlock, software fallback is not helpful in this
> >>>> case.
> >>>>
> >>> I don't think this is true. IIUC, the rationale of the config is that
> >>> for some architectures, since the architectural cmpxchg doesn't provide
> >>> forward-progress guarantee then using cmpxchg of machine-word to
> >>> implement xchg{8,16}() may cause livelock, therefore these architectures
> >>> don't want to provide xchg{8,16}(), as a result they cannot work with
> >>> qspinlock when _Q_PENDING_BITS is 8.
> >>>
> >>> So as long as an architecture can provide and has already provided an
> >>> implementation of xchg{8,16}() which guarantee forward-progress (even
> >>> though the implementation is using a machine-word size cmpxchg), the
> >>> architecture doesn't need to select ARCH_HAS_HW_XCHG_SMALL.
> >> Seems only atomic could provide forward progress, isn't it? And lr/sc
> >> couldn't implement xchg/cmpxchg primitive properly.
> >>
> > I'm missing you point here, a) ll/sc can provide forward progress and b)
> > ll/sc instructions are used to implement xchg/cmpxchg (see ARM64 and
> > PPC).
> >
> >> How to make CPU guarantee  "load + cmpxchg" forward-progress? Fusion
> >> these instructions and lock the snoop channel?
> >> Maybe hardware guys would think that it's easier to implement cas +
> >> dcas + amo(short & byte).
> >>
> > Please note that if _Q_PENDING_BITS == 1, then the xchg_tail() is
> > implemented as a "load + cmpxchg", so if "load + cmpxchg" implementation
> > of xchg16() doesn't provide forward-progress in an architecture, neither
> > does xchg_tail().
>
> Agreed. The xchg_tail() for the "_Q_PENDING_BITS == 1" case is a
> software emulation of xchg16(). Pure software emulation like that does
> not provide forward progress guarantee. This is usually not a big
> problem for non-RT kernel for which occasional long latency is
> acceptable, but it is not good for RT kernel.
"How to implement xchg_tail" shouldn't force with _Q_PENDING_BITS, but
the arch could choose.
But this will raise another topic, is qspinlock suitable for these
arches? Maybe you've answered here: "for the non-RT kernel, Yes".

I remember you are the person who doesn't against the patch I've sent [1]:
[1] https://lore.kernel.org/linux-riscv/b6466a43-6fb3-dc47-e0ef-d493e0930ab2@xxxxxxxxxx/

>
> Cheers,
> Longman
>

-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/