Re: [RFC][PATCH] spin loop arch primitives for busy waiting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 06, 2017 at 10:31:46AM -0700, Linus Torvalds wrote:
> And we'd probably want to make it even more strict, in that soem mwait
> implementations might simply not be very good for short waits.

Yeah, we need to find something that works; assuming its beneficial at
all on modern chips.

> > But it builds and boots, no clue if its better or worse. Changing mwait
> > eax to 0 would give us C1 and might also be worth a try I suppose.
> 
> Hmm. Also:
> 
> > +               ___mwait(0xf0 /* C0 */, 0x01 /* INT */);                \
> 
> Do you actually want that "INT" bit set? It's only meaningful if
> interrupts are _masked_, afaik. Which doesn't necessarily make sense
> for this case.

Correct, in fact its quite broken. Corrected.

> But maybe "monitor" is really cheap. I suspect it's microcoded,
> though, which implies "no".

Something like so then. According to the SDM mwait is a no-op if we do
not execute monitor first. So this variant should get the first
iteration without expensive instructions.

And this variant is actually quite amenable to alternative(), with which
we can pick between PAUSE and this.

---
 arch/x86/include/asm/barrier.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index bfb28ca..2b5d5a2 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -80,6 +80,36 @@ do {									\
 #define __smp_mb__before_atomic()	barrier()
 #define __smp_mb__after_atomic()	barrier()
 
+static inline void ___monitor(const void *eax, unsigned long ecx,
+			     unsigned long edx)
+{
+	/* "monitor %eax, %ecx, %edx;" */
+	asm volatile(".byte 0x0f, 0x01, 0xc8;"
+		     :: "a" (eax), "c" (ecx), "d"(edx));
+}
+
+static inline void ___mwait(unsigned long eax, unsigned long ecx)
+{
+	/* "mwait %eax, %ecx;" */
+	asm volatile(".byte 0x0f, 0x01, 0xc9;"
+		     :: "a" (eax), "c" (ecx));
+}
+
+#define smp_cond_load_acquire(ptr, cond_expr)				\
+({									\
+	typeof(ptr) __PTR = (ptr);					\
+	typeof(*ptr) VAL;						\
+	for (;;) {							\
+		VAL = READ_ONCE(*__PTR);				\
+		if (cond_expr)						\
+			break;						\
+		___mwait(0xf0 /* C0 */, 0);				\
+		___monitor(__PTR, 0, 0);				\
+	}								\
+	smp_acquire__after_ctrl_dep();					\
+	VAL;								\
+})
+
 #include <asm-generic/barrier.h>
 
 #endif /* _ASM_X86_BARRIER_H */



[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux