Re: lockless case of retain_dentry() (was Re: [PATCH 09/15] fold the call of retain_dentry() into fast_dput())

On Wed, 22 Nov 2023 at 09:20, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> If you want to do the optimization that the compiler misses by hand,
> it would be something like the attached patch.

Bah. Might as well do the reference decrements with the same logic,
not just the increments.

Of course, this is much more noticeable with the ticket locks, because
with the qspinlocks the "is this unlocked" test will check whether the
lock is all zeroes.

So with qspinlocks, the compiler sees that "oh, the low 32 bits are
zero", and the whole "merge the two words back to 64 bits" is much
cheaper, and doesn't generate quite the mess that it does for RISC-V
with ticket locks.
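
For reference, the cmpxchg fast path in lib/lockref.c looks roughly
like this (a paraphrased sketch with the retry limit elided, not the
verbatim macro):

	old.lock_count = READ_ONCE(lockref->lock_count);
	/* the "is this unlocked" test - all zeroes for qspinlocks */
	while (arch_spin_value_unlocked(old.lock.rlock.raw_lock)) {
		struct lockref new = old;
		CODE		/* e.g. new.count++; */
		if (try_cmpxchg64_relaxed(&lockref->lock_count,
					  &old.lock_count,
					  new.lock_count))
			SUCCESS;
		cpu_relax();
	}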

But this "treat the lockref as a 64-bit entity" thing is probably a
good thing on most 64-bit architectures, including x86 that has that
qspinlock thing.
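
To see why treating it as one 64-bit word is valid, here's a minimal
user-space sketch of the trick (the union below is a hypothetical
stand-in for the kernel's struct lockref, which overlays the 32-bit
spinlock and the 32-bit count onto a single 64-bit word):

	#include <assert.h>
	#include <stdint.h>

	/* Stand-in layout for little-endian 64-bit: the lock sits in
	 * the low 32 bits, the count in the high 32 bits. */
	union fake_lockref {
		uint64_t lock_count;
		struct {
			uint32_t lock;		/* stand-in spinlock */
			int32_t count;
		};
	};

	int main(void)
	{
		union fake_lockref n = { 0 };
		n.count = 5;

		/* LOCKREF_ADD(n,1): one 64-bit add in the high half */
		n.lock_count += (uint64_t)1 << 32;
		assert(n.count == 6 && n.lock == 0);

		/* LOCKREF_ADD(n,-1): (uint64_t)-1 << 32 wraps so the
		 * high 32 bits go down by one, low bits untouched */
		n.lock_count += (uint64_t)-1 << 32;
		assert(n.count == 5 && n.lock == 0);
		return 0;
	}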

Still not actually tested, but the code generation on x86 looks
reasonable, so it might be worth looking at whether it helps the
RISC-V case.

                 Linus
diff --git a/lib/lockref.c b/lib/lockref.c
index 2afe4c5d8919..56f4419f593d 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -26,6 +26,17 @@
 	}									\
 } while (0)
 
+/*
+ * The compiler isn't smart enough to do the count
+ * increment in the high 32 bits of the 64-bit value,
+ * so do this optimization by hand.
+ */
+#if defined(__LITTLE_ENDIAN) && BITS_PER_LONG == 64
+ #define LOCKREF_ADD(n,x) ((n).lock_count += (unsigned long)(x)<<32)
+#else
+ #define LOCKREF_ADD(n,x) ((n).count += (x))
+#endif
+
 #else
 
 #define CMPXCHG_LOOP(CODE, SUCCESS) do { } while (0)
@@ -42,7 +53,7 @@
 void lockref_get(struct lockref *lockref)
 {
 	CMPXCHG_LOOP(
-		new.count++;
+		LOCKREF_ADD(new,1);
 	,
 		return;
 	);
@@ -63,7 +74,7 @@ int lockref_get_not_zero(struct lockref *lockref)
 	int retval;
 
 	CMPXCHG_LOOP(
-		new.count++;
+		LOCKREF_ADD(new,1);
 		if (old.count <= 0)
 			return 0;
 	,
@@ -91,7 +102,7 @@ int lockref_put_not_zero(struct lockref *lockref)
 	int retval;
 
 	CMPXCHG_LOOP(
-		new.count--;
+		LOCKREF_ADD(new,-1);
 		if (old.count <= 1)
 			return 0;
 	,
@@ -119,7 +130,7 @@ EXPORT_SYMBOL(lockref_put_not_zero);
 int lockref_put_return(struct lockref *lockref)
 {
 	CMPXCHG_LOOP(
-		new.count--;
+		LOCKREF_ADD(new,-1);
 		if (old.count <= 0)
 			return -1;
 	,
@@ -137,7 +148,7 @@ EXPORT_SYMBOL(lockref_put_return);
 int lockref_put_or_lock(struct lockref *lockref)
 {
 	CMPXCHG_LOOP(
-		new.count--;
+		LOCKREF_ADD(new,-1);
 		if (old.count <= 1)
 			break;
 	,
@@ -174,7 +185,7 @@ int lockref_get_not_dead(struct lockref *lockref)
 	int retval;
 
 	CMPXCHG_LOOP(
-		new.count++;
+		LOCKREF_ADD(new,1);
 		if (old.count < 0)
 			return 0;
 	,
