Re: [RFC PATCH 00/15] Provide atomics and bitops implemented with ISO C++11 atomics

Hi David,

On Wed, May 18, 2016 at 04:10:37PM +0100, David Howells wrote:
> 
> Here's a set of patches to provide kernel atomics and bitops implemented
> with ISO C++11 atomic intrinsics.  The second part of the set makes the x86
> arch use the implementation.

As you know, I'm really not a big fan of this :)

Whilst you're seeing some advantages in using this on x86, I suspect
that's because the vast majority of memory models out there end up using
similar instruction sequences on that architecture (i.e. MOV and a very
occasional MFENCE). For weakly ordered architectures such as arm64, the
kernel memory model is noticeably different to that offered by C11 and
I'd be hesitant to map the two as you're proposing here, for the following
reasons:

  (1) C11's SC RMW operations are weaker than our full barrier atomics

  (2) There is no high quality implementation of consume loads, so we'd
      either need to continue using our existing rcu_dereference code or
      be forced to use acquire loads

  (3) wmb/rmb don't exist in C11

  (4) We patch our atomics at runtime based on the CPU capabilities, since
      we require a single binary kernel Image

  (5) Even recent versions of GCC have been found to have serious issues
      generating correct (let alone performant) code [1]

  (6) If we start mixing and patching C11 atomics with homebrew atomics
      in an attempt to address some of the issues above, we open ourselves
      up to potential data races (i.e. undefined behaviour), but I doubt
      existing compilers actually manage to detect this.
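
A minimal sketch of the concern in point (1), written against C11's
<stdatomic.h>. The variables x, y and v and the functions writer() and
reader() are hypothetical names chosen for illustration. The sketch
assumes the kernel rule that a value-returning atomic such as
atomic_inc_return() acts as a full barrier, so stores on either side of
it stay ordered. As far as the C11 model goes, a seq_cst RMW on v gives
no such guarantee for the relaxed stores to x and y around it.

```c
#include <stdatomic.h>

static _Atomic int x, y, v;

/* Two relaxed stores separated by a seq_cst RMW on an unrelated object. */
static inline void writer(void)
{
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	atomic_fetch_add_explicit(&v, 1, memory_order_seq_cst);
	atomic_store_explicit(&y, 1, memory_order_relaxed);
}

/*
 * Returns 1 if this thread saw y == 1 but x == 0. When run concurrently
 * with writer(), C11 does not forbid that outcome, because the writer has
 * no fence or release operation that the load of y can synchronise with.
 */
static inline int reader(void)
{
	int r1 = atomic_load_explicit(&y, memory_order_relaxed);

	atomic_thread_fence(memory_order_seq_cst);

	int r2 = atomic_load_explicit(&x, memory_order_relaxed);

	return r1 == 1 && r2 == 0;
}
```

Whether a given compiler and CPU ever exhibit the reordering is a separate
question; the point is only that the language does not rule it out.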

Now, given all of that, you might be surprised to hear that I'm not
completely against some usage of C11 atomics in the kernel! What I think
would work quite nicely is defining an asm-generic interface built solely
out of the C11 _relaxed atomics and SC fences. Would it be efficient? Almost
certainly not. Would it be useful for new architecture ports to get up and
running quickly? Definitely.
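
As a rough illustration of what such a generic interface could look like,
here is a sketch built only from relaxed C11 atomics and SC fences. It is
not taken from the patch set: the atomic_t definition is a hypothetical
stand-in for the kernel type, and mapping smp_rmb()/smp_wmb() onto a full
SC fence is an assumption made because C11 has no direct equivalent.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's atomic_t. */
typedef struct { _Atomic int counter; } atomic_t;

/*
 * C11 has no rmb/wmb, so this sketch conservatively uses a full
 * SC fence for all three barriers.
 */
#define smp_mb()	atomic_thread_fence(memory_order_seq_cst)
#define smp_rmb()	smp_mb()
#define smp_wmb()	smp_mb()

/* Relaxed RMW: atomic, but with no ordering guarantees. */
static inline int atomic_add_return_relaxed(int i, atomic_t *v)
{
	return atomic_fetch_add_explicit(&v->counter, i,
					 memory_order_relaxed) + i;
}

/* Fully ordered RMW: the relaxed operation bracketed by SC fences. */
static inline int atomic_add_return(int i, atomic_t *v)
{
	int ret;

	smp_mb();
	ret = atomic_add_return_relaxed(i, v);
	smp_mb();

	return ret;
}
```

Two fences per operation is heavier than most architectures need, which
matches the "almost certainly not efficient" caveat above, but it keeps
the mapping simple enough for a new port to start from.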

In my opinion, if an architecture wants to go further than that (like you've
proposed here), then the code should be entirely confined to the relevant
arch/ directory and not advertised as a general, portable mapping between
the memory models.

Will

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69875