Re: [PATCH v2] parisc: Rewrite light-weight syscall and futex code

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2021-12-23 2:58 p.m., Helge Deller wrote:
On 12/23/21 20:31, Helge Deller wrote:
On 12/23/21 20:17, John David Anglin wrote:
On 2021-12-23 1:47 p.m., Helge Deller wrote:
...
In order to do this, we need a mechanism to trigger COW breaks outside the
critical region. Fortunately, parisc has the "stbys,e" instruction. When
the leftmost byte of a word is addressed, this instruction triggers all
the exceptions of a normal store but it does not write to memory. Thus,
we can use it to trigger COW breaks outside the critical region without
modifying the data that is to be updated atomically.
...
diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
index 9cd4dd6e63ad..8f97db995b04 100644
--- a/arch/parisc/include/asm/futex.h
+++ b/arch/parisc/include/asm/futex.h
...
+static inline bool _futex_force_interruptions(unsigned long ua)
+{
+    bool result;
+
+    __asm__ __volatile__(
+        "1:\tldw 0(%1), %0\n"
+        "2:\tstbys,e %%r0, 0(%1)\n"
+        "\tcomclr,= %%r0, %%r0, %0\n"
+        "3:\tldi 1, %0\n"
+        ASM_EXCEPTIONTABLE_ENTRY(1b, 3b)
+        ASM_EXCEPTIONTABLE_ENTRY(2b, 3b)
+        : "=&r" (result) : "r" (ua) : "memory"
+    );
+    return result;
I wonder if we can get rid of the comclr,= instruction by using
ASM_EXCEPTIONTABLE_ENTRY_EFAULT instead of ASM_EXCEPTIONTABLE_ENTRY,
e.g.:

diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
index 8f97db995b04..ea052f013865 100644
--- a/arch/parisc/include/asm/futex.h
+++ b/arch/parisc/include/asm/futex.h
@@ -21,20 +21,21 @@ static inline unsigned long _futex_hash_index(unsigned long ua)
    * if load and store fault.
    */

-static inline bool _futex_force_interruptions(unsigned long ua)
+static inline unsigned long _futex_force_interruptions(unsigned long ua)
   {
-    bool result;
+    register unsigned long error __asm__ ("r8") = 0;
+    register unsigned long temp;

       __asm__ __volatile__(
-        "1:\tldw 0(%1), %0\n"
-        "2:\tstbys,e %%r0, 0(%1)\n"
-        "\tcomclr,= %%r0, %%r0, %0\n"
-        "3:\tldi 1, %0\n"
-        ASM_EXCEPTIONTABLE_ENTRY(1b, 3b)
-        ASM_EXCEPTIONTABLE_ENTRY(2b, 3b)
-        : "=&r" (result) : "r" (ua) : "memory"
+        "1:\tldw 0(%2), %0\n"
+        "2:\tstbys,e %%r0, 0(%2)\n"
+        "3:\n"
+        ASM_EXCEPTIONTABLE_ENTRY_EFAULT(1b, 3b)
+        ASM_EXCEPTIONTABLE_ENTRY_EFAULT(2b, 3b)
+        : "=r" (temp), "=r" (error)
+        : "r" (ua), "1" (error) : "memory"
       );
-    return result;
+    return error;
   }
I don't think this is a win.

1) Register %r8 needs to get loaded with 0. So, that's one instruction.
True. On the other hand you don't need the "ldi 1,%0" then, which brings parity.

2) Register %r8 is a caller saves register. Using it will cause gcc to save and restore it from stack. This may
cause a stack frame to be created when one isn't needed. The save and restore instructions are more
expensive, particularly if they cause a TLB miss.
Because of this reason, wouln't it make sense to switch the uaccess functions away from r8 too,
and use another temp register in both?

Note that the comclr both clears result and nullifies the following ldi instruction, so it is not executed in the fast path.
Yes, but when emulating with qemu, it generates comparism and jumps, while the loading of r8 (see 1)), is trivial.

If we change the uaccess functions away from r8, then we can drop comclr (and save one instruction).
Independend of your patch I think we should go away in uacces from r8.
I did some testing (on 32bit kernel) and r29 seems good.
Opinion?
I'm thinking r28 is better.  It's standard return value and would work better in _futex_force_interruptions().

Dave

--
John David Anglin  dave.anglin@xxxxxxxx




[Index of Archives]     [Linux SoC]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux