On Wed, Jun 26, 2019 at 10:20:05PM +0200, Thomas Gleixner wrote: > On Tue, 18 Jun 2019, Fenghua Yu wrote: > > + > > +static atomic_t split_lock_debug; > > + > > +void split_lock_disable(void) > > +{ > > + /* Disable split lock detection on this CPU */ > > + this_cpu_and(msr_test_ctl_cached, ~MSR_TEST_CTL_SPLIT_LOCK_DETECT); > > + wrmsrl(MSR_TEST_CTL, this_cpu_read(msr_test_ctl_cached)); > > + > > + /* > > + * Use the atomic variable split_lock_debug to ensure only the > > + * first CPU hitting split lock issue prints one single complete > > + * warning. This also solves the race if the split-lock #AC fault > > + * is re-triggered by NMI of perf context interrupting one > > + * split-lock warning execution while the original WARN_ONCE() is > > + * executing. > > + */ > > + if (atomic_cmpxchg(&split_lock_debug, 0, 1) == 0) { > > + WARN_ONCE(1, "split lock operation detected\n"); > > + atomic_set(&split_lock_debug, 0); > > What's the purpose of this atomic_set()? atomic_set() releases the split_lock_debug flag after WARN_ONCE() is done. The same split_lock_debug flag will be used in sysfs write for atomic operation as well, as proposed by Ingo in https://lkml.org/lkml/2019/4/25/48 So that's why the flag needs to be cleared, right? > > > +dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code) > > +{ > > + unsigned int trapnr = X86_TRAP_AC; > > + char str[] = "alignment check"; > > + int signr = SIGBUS; > > + > > + RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU"); > > + > > + if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) == NOTIFY_STOP) > > + return; > > + > > + cond_local_irq_enable(regs); > > + if (!user_mode(regs) && static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) { > > + /* > > + * Only split locks can generate #AC from kernel mode. > > + * > > + * The split-lock detection feature is a one-shot > > + * debugging facility, so we disable it immediately and > > + * print a warning. > > + * > > + * This also solves the instruction restart problem: we > > + * return the faulting instruction right after this it > > we return the faulting instruction ... to the store so we get our deposit > back :) > > the fault handler returns to the faulting instruction which will be then > executed without .... > > Don't try to impersonate code, cpus or whatever. It doesn't make sense and > confuses people. > > > + * will be executed without generating another #AC fault > > + * and getting into an infinite loop, instead it will > > + * continue without side effects to the interrupted > > + * execution context. > > That last part 'instead .....' is redundant. It's entirely clear from the > above that the faulting instruction is reexecuted .... > > Please write concise comments and do try to repeat the same information > with a different painting. I copied the comment completely from Ingo's comment on v8: https://lkml.org/lkml/2019/4/25/40 Thanks. -Fenghua