Re: [RFC PATCH v1 1/5] locking/atomic: Implement atomic_fetch_and_or

Hi, all,

On Sat, Jul 31, 2021 at 2:40 AM Waiman Long <llong@xxxxxxxxxx> wrote:
>
> On 7/29/21 6:18 AM, hev wrote:
> > Hi, Will,
> >
> > On Thu, Jul 29, 2021 at 5:39 PM Will Deacon <will@xxxxxxxxxx> wrote:
> >> On Wed, Jul 28, 2021 at 07:48:22PM +0800, Rui Wang wrote:
> >>> From: wangrui <wangrui@xxxxxxxxxxx>
> >>>
> >>> This patch introduces a new atomic primitive, 'and_or'. It may have three
> >>> types of implementations:
> >>>
> >>>   * The generic implementation is based on arch_cmpxchg.
> >>>   * The hardware supports atomic 'and_or' of single instruction.
> >> Do any architectures actually support this instruction?
> > No, I'm not sure now.
> >
> >> On arm64, we can clear arbitrary bits and we can set arbitrary bits, but we
> >> can't combine the two in a fashion which provides atomicity and
> >> forward-progress guarantees.
> >>
> >> Please can you explain how this new primitive will be used, in case there's
> >> an alternative way of doing it which maps better to what CPUs can actually
> >> do?
> > I think we can easily exchange arbitrary bits of a machine word with atomic
> > andnot_or/and_or. Otherwise, we can only use xchg8/16 to do it. That depends
> > on hardware support, and the key point is that the bits to be exchanged must
> > be in the same sub-word. qspinlock adjusted its memory layout for this reason,
> > and wastes some bits (_Q_PENDING_BITS == 8).
>
> It is not actually a waste of bits. With _Q_PENDING_BITS==8, more
> optimized code can be used for pending bit processing. It is only in the
> rare case that NR_CPUS >= 16k - 1 that we have to fall back to
> _Q_PENDING_BITS==1. In fact, that should be the only condition that will
> make _Q_PENDING_BITS=1.
Our original goal was to let LoongArch (and CSKY, RISC-V, etc.) use
qspinlock, but these architectures lack sub-word xchg/cmpxchg. Arnd
suggested that we not use qspinlock, but LoongArch has large SMP (and
NUMA) systems, so we need it. Peter suggested that we implement
atomic_fetch_and_or, but it seems that not everyone agrees. So I think
the only option left is to fix the badly implemented xchg_small() for
MIPS and LoongArch.
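
To illustrate what emulating a sub-word exchange on a full word involves,
here is a minimal userspace sketch using C11 atomics. It is not the kernel's
xchg_small(); the name xchg16_in_word, its signature, and the choice of
passing the containing 32-bit word plus a bit offset are all assumptions made
for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): atomically replace the 16-bit
 * field that starts at bit `shift` of *word, and return the field's
 * previous value. The other 16 bits of the word are preserved.
 */
static inline uint16_t xchg16_in_word(_Atomic uint32_t *word,
                                      unsigned int shift,
                                      uint16_t newval)
{
    const uint32_t mask = (uint32_t)0xffff << shift;
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t desired;

    do {
        /* Keep the bits outside the field, splice in the new value. */
        desired = (old & ~mask) | ((uint32_t)newval << shift);
        /* On failure, `old` is refreshed with the current word value. */
    } while (!atomic_compare_exchange_weak_explicit(word, &old, desired,
                                                    memory_order_seq_cst,
                                                    memory_order_relaxed));

    return (uint16_t)((old & mask) >> shift);
}
```

With a word holding 0x11223344, calling xchg16_in_word(&w, 16, 0xabcd)
returns 0x1122 and leaves 0xabcd3344 in the word. Note that a loop like this
can be forced to retry by writers that only touch the other half of the word.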

Huacai
>
> Cheers,
> Longman
>
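
For reference, the "generic implementation based on arch_cmpxchg" mentioned
in the patch description can be sketched in userspace C11 roughly as follows.
This is only an illustration: the function name, the argument order, and the
assumed semantics (new value = (old & and_mask) | or_mask, returning the old
value) are assumptions, not the code from the patch.

```c
#include <stdatomic.h>

/*
 * Hypothetical sketch (not the patch's code): atomically compute
 * *v = (*v & and_mask) | or_mask and return the previous value of *v.
 */
static inline int fetch_and_or_sketch(atomic_int *v, int and_mask, int or_mask)
{
    int old = atomic_load_explicit(v, memory_order_relaxed);

    /*
     * The desired value is recomputed on every iteration because a failed
     * compare-exchange writes the current value of *v back into `old`.
     */
    while (!atomic_compare_exchange_weak_explicit(v, &old,
                                                  (old & and_mask) | or_mask,
                                                  memory_order_seq_cst,
                                                  memory_order_relaxed))
        ;

    return old;
}
```

For example, with *v == 0xff0f, fetch_and_or_sketch(&v, ~0xff00, 0x1200)
returns 0xff0f and leaves 0x120f in *v, which replaces the upper byte of the
low half-word in a single atomic step.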