Re: [PATCH RFC] Avoid memory barrier in read_seqcount() through load acquire

On 8/13/24 14:26, Christoph Lameter via B4 Relay wrote:
From: "Christoph Lameter (Ampere)" <cl@xxxxxxxxxx>

Some architectures support load acquire which can save us a memory
barrier and save some cycles.

A typical sequence

	do {
		seq = read_seqcount_begin(&s);
		<something>
	} while (read_seqcount_retry(&s, seq));

requires 13 cycles on ARM64 for an empty loop. Two read memory barriers are
needed. One for each of the seqcount_* functions.

We can replace the first read barrier with a load acquire of
the seqcount which saves us one barrier.

On ARM64 doing so reduces the cycle count from 13 to 8.

Signed-off-by: Christoph Lameter (Ampere) <cl@xxxxxxxxxx>
---
  arch/Kconfig            |  5 +++++
  arch/arm64/Kconfig      |  1 +
  include/linux/seqlock.h | 41 +++++++++++++++++++++++++++++++++++++++++
  3 files changed, 47 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 975dd22a2dbd..3f8867110a57 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1600,6 +1600,11 @@ config ARCH_HAS_KERNEL_FPU_SUPPORT
  	  Architectures that select this option can run floating-point code in
  	  the kernel, as described in Documentation/core-api/floating-point.rst.
+config ARCH_HAS_ACQUIRE_RELEASE
+	bool
+	help
+	  Architectures that support acquire / release can avoid memory fences
+
  source "kernel/gcov/Kconfig"
source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a2f8ff354ca6..19e34fff145f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -39,6 +39,7 @@ config ARM64
  	select ARCH_HAS_PTE_DEVMAP
  	select ARCH_HAS_PTE_SPECIAL
  	select ARCH_HAS_HW_PTE_YOUNG
+	select ARCH_HAS_ACQUIRE_RELEASE
  	select ARCH_HAS_SETUP_DMA_OPS
  	select ARCH_HAS_SET_DIRECT_MAP
  	select ARCH_HAS_SET_MEMORY

Do we need a new ARCH flag? I believe the generic barrier APIs such as smp_load_acquire() fall back to a full barrier on architectures that don't define their own smp_load_acquire().
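
To illustrate the point, here is a rough userspace analogue using C11 atomics rather than the kernel macros. The names load_acquire_fallback(), load_acquire_native() and load_acquire_demo() are made up for this sketch. The first models a load followed by a full fence, the second an acquire load that an architecture can implement directly (LDAR on arm64, for instance). Both return the same value; only the ordering cost differs.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Hypothetical stand-in for a generic fallback: a relaxed load
 * followed by a full (sequentially consistent) fence.
 */
static inline unsigned load_acquire_fallback(atomic_uint *p)
{
	unsigned v = atomic_load_explicit(p, memory_order_relaxed);

	atomic_thread_fence(memory_order_seq_cst);	/* full barrier */
	return v;
}

/* Acquire load; the compiler may emit a single load-acquire instruction. */
static inline unsigned load_acquire_native(atomic_uint *p)
{
	return atomic_load_explicit(p, memory_order_acquire);
}

/* Single-threaded sanity check: both variants observe the stored value. */
static void load_acquire_demo(void)
{
	atomic_uint seq = 4;

	assert(load_acquire_fallback(&seq) == 4);
	assert(load_acquire_native(&seq) == 4);
}
```

If the fallback already behaves like the first function, callers could use the acquire primitive unconditionally and a Kconfig symbol would only matter where the fallback is slower than the existing read barrier.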

BTW, acquire/release operations can be considered memory barriers too. Maybe you mean preferring acquire/release barriers over read/write barriers, right?
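
A minimal sketch of the two read-side variants, again in userspace C11 and not the kernel's seqcount_t code. All demo_* names are hypothetical. Variant A does a plain load followed by a separate read-side fence, roughly what a read barrier after the load amounts to; variant B folds the ordering into the load itself. The retry check keeps its own fence in both cases, and a single writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace sequence counter, for illustration only. */
struct demo_seqcount {
	atomic_uint sequence;
};

/* Variant A: relaxed load, then a separate acquire fence. */
static inline unsigned demo_read_begin_fence(struct demo_seqcount *s)
{
	unsigned seq;

	while ((seq = atomic_load_explicit(&s->sequence, memory_order_relaxed)) & 1)
		;	/* odd count: writer in progress, spin */
	atomic_thread_fence(memory_order_acquire);
	return seq;
}

/* Variant B: the load itself has acquire semantics, no extra fence. */
static inline unsigned demo_read_begin_acquire(struct demo_seqcount *s)
{
	unsigned seq;

	while ((seq = atomic_load_explicit(&s->sequence, memory_order_acquire)) & 1)
		;	/* odd count: writer in progress, spin */
	return seq;
}

/* Order the protected reads before re-reading the counter. */
static inline int demo_read_retry(struct demo_seqcount *s, unsigned start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->sequence, memory_order_relaxed) != start;
}

/* Writer side: make the count odd before touching the data. */
static inline void demo_write_begin(struct demo_seqcount *s)
{
	unsigned seq = atomic_load_explicit(&s->sequence, memory_order_relaxed);

	assert(!(seq & 1));	/* single writer assumed, no nesting */
	atomic_store_explicit(&s->sequence, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
}

/* Writer side: publish the data by making the count even again. */
static inline void demo_write_end(struct demo_seqcount *s)
{
	unsigned seq = atomic_load_explicit(&s->sequence, memory_order_relaxed);

	atomic_store_explicit(&s->sequence, seq + 1, memory_order_release);
}
```

In a real concurrent user of this sketch, the protected data would also have to be accessed with relaxed atomics to stay free of data races under the C11 model.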

Cheers,
Longman