On Mon, Apr 06, 2020 at 08:15:56PM +0300, Jarkko Sakkinen wrote:
> On Mon, Apr 06, 2020 at 08:10:29PM +0300, Jarkko Sakkinen wrote:
> > On Mon, Apr 06, 2020 at 07:36:38AM -0700, Sean Christopherson wrote:
> > > On Sat, Apr 04, 2020 at 04:12:02AM +0300, Jarkko Sakkinen wrote:
> > > > On Fri, Apr 03, 2020 at 04:42:39PM -0700, Sean Christopherson wrote:
> > > > > On Fri, Apr 03, 2020 at 12:35:50PM +0300, Jarkko Sakkinen wrote:
> > > > > > From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> > > > > > @@ -221,12 +224,16 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
> > > > > >  		return ret;
> > > > > >  	}
> > > > > >
> > > > > > +	/*
> > > > > > +	 * The page reclaimer uses the list version for synchronization instead of
> > > > > > +	 * synchronize_srcu() because otherwise we could conflict with
> > > > > > +	 * dup_mmap().
> > > > > > +	 */
> > > > > >  	spin_lock(&encl->mm_lock);
> > > > > >  	list_add_rcu(&encl_mm->list, &encl->mm_list);
> > > > >
> > > > > You dropped the smp_wmb().
> > > >
> > > > As I said to you in my review, the x86 pipeline does not reorder writes.
> > >
> > > And as I pointed out in this thread, smp_wmb() is a _compiler_ barrier if
> > > and only if CONFIG_SMP=y.  The compiler can reorder list_add_rcu() and
> > > mm_list_version++ because from its perspective there is no dependency
> > > between the two.  And that's entirely true except for the SMP case, where
> > > the consumer of mm_list_version is relying on the list to be updated before
> > > the version changes.
> >
> > I see.
> >
> > So why not change the variable to volatile, given that x86 is the only
> > arch where this code gets used?
>
> Please note that I'm fully aware of
>
> https://www.kernel.org/doc/html/latest/process/volatile-considered-harmful.html
>
> Just wondering.  Anyway, I'll add smp_wmb() back since it is the safe play
> in terms of acceptance.

Because volatile is overkill and too heavy-handed for what is needed here.
E.g. if SMP=n, then no barriers are needed whatsoever, as a CPU can't race
with itself.

Even for SMP=y, volatile is too much, as it prevents optimizations that would
otherwise be allowed.  For the smp_wmb() it doesn't matter much, if at all,
since the list and version are updated inside a critical section, i.e.
barriered on both sides, so the compiler can't do much optimization anyway.

But for the smp_rmb() case, the compiler is free to cache the version in a
register early on in the function; it just needs to make sure the access is
done before starting the list walk.  If encl->mm_list_version were volatile,
the compiler would not be allowed to do such an optimization, as it would be
required to access memory exactly when the variable is referenced in code.

This is easily visible in the compiled code, as encl->mm_list_version is only
read from memory once per iteration (in the unlikely case that there are
multiple iterations).

It takes the do-while loop, which appears to read encl->mm_list_version twice
per iteration:

	do {
		mm_list_version = encl->mm_list_version;

		<walk list>

	} while (unlikely(encl->mm_list_version != mm_list_version));

And turns it into an optimized loop that loads encl->mm_list_version the
minimum number of times.  If encl->mm_list_version were volatile, the
compiler would not be allowed to cache it in %rax.  (A sketch of the
smp_wmb()/smp_rmb() pairing itself follows the disassembly below.)
	mov    0x58(%r12),%rax             // Load from encl->mm_list_version
	lea    0x40(%r12),%rbx             // Interleaved to optimize ALU vs LD
	and    $0xfffffffffffff000,%rbp    // Interleaved to optimize ALU vs LD
	mov    %rax,0x8(%rsp)              // mm_list_version = encl->mm_list_version

walk_mm_list:
	<blah blah blah>

	mov    0x58(%r12),%rax             // Load from encl->mm_list_version
	cmp    0x8(%rsp),%rax              // Compare against mm_list_version
	jne    update_mm_list_version

	<happy path>

	ret

update_mm_list_version:
	mov    %rax,0x8(%rsp)              // mm_list_version = encl->mm_list_version
	jmpq   0xffffffff8102e460 <walk_mm_list>
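
For reference, a rough sketch of the smp_wmb()/smp_rmb() pairing being
discussed, not the exact driver code: the field names (encl->mm_list,
encl->mm_list_version, encl->mm_lock, encl_mm->list) follow the thread, the
list-walk body is elided, and the surrounding functions are omitted.  The
writer publishes the new list entry before bumping the version; the reader
orders its version read before the walk and re-checks the version afterwards.

	/* Writer side (the sgx_encl_mm_add() path quoted above), sketched: */
	spin_lock(&encl->mm_lock);
	list_add_rcu(&encl_mm->list, &encl->mm_list);
	/*
	 * Publish the list entry before bumping the version so that a reader
	 * which observes the new version also observes the new entry.  Pairs
	 * with the smp_rmb() in the reader below.
	 */
	smp_wmb();
	encl->mm_list_version++;
	spin_unlock(&encl->mm_lock);

	/* Reader side (the reclaimer's do-while loop above), sketched: */
	do {
		mm_list_version = encl->mm_list_version;

		/*
		 * Pairs with the smp_wmb() above: read the version before
		 * starting the list walk.
		 */
		smp_rmb();

		rcu_read_lock();
		list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
			/* <walk list> */
		}
		rcu_read_unlock();

	} while (unlikely(encl->mm_list_version != mm_list_version));

With the barriers placed like this, a reader racing with sgx_encl_mm_add()
either sees the new entry or retries the walk because the version changed,
which is the property the list version provides in place of
synchronize_srcu(), per the comment in the quoted hunk.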