On Thu, Dec 3, 2020 at 9:07 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>
> membarrier()'s MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE is documented
> as syncing the core on all sibling threads but not necessarily the
> calling thread.  This behavior is fundamentally buggy and cannot be used
> safely.  Suppose a user program has two threads.  Thread A is on CPU 0
> and thread B is on CPU 1.  Thread A modifies some text and calls
> membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE).  Then thread B
> executes the modified code.  At any point after membarrier() decides
> which CPUs to target, thread A could be preempted and replaced by thread
> B on CPU 0.  This could even happen on exit from the membarrier()
> syscall.  If this happens, thread B will end up running on CPU 0 without
> having synced.
>
> In principle, this could be fixed by arranging for the scheduler to
> sync_core_before_usermode() whenever switching between two threads in
> the same mm if there is any possibility of a concurrent membarrier()
> call, but this would have considerable overhead.  Instead, make
> membarrier() sync the calling CPU as well.
>
> As an optimization, this avoids an extra smp_mb() in the default
> barrier-only mode.

Fixes: 70216e18e519 ("membarrier: Provide core serializing command, *_SYNC_CORE")

also:

> +		/*
> +		 * For regular membarrier, we can save a few cycles by
> +		 * skipping the current cpu -- we're about to do smp_mb()
> +		 * below, and if we migrate to a different cpu, this cpu
> +		 * and the new cpu will execute a full barrier in the
> +		 * scheduler.
> +		 *
> +		 * For CORE_SYNC, we do need a barrier on the current cpu --

s/CORE_SYNC/SYNC_CORE/

--Andy
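
For reference, the userspace side of the scenario in the changelog (thread A modifies text, then calls membarrier()) might look roughly like the sketch below. This is only an illustration, not code from the patch: the helper names sync_core_supported(), jit_register() and jit_publish_barrier() are made up here, and the sketch assumes a Linux system whose uapi headers define the SYNC_CORE commands. glibc does not provide a membarrier() wrapper, so the call goes through syscall(2).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw syscall; glibc has no membarrier(). */
static int membarrier(int cmd, unsigned int flags, int cpu_id)
{
	return (int)syscall(__NR_membarrier, cmd, flags, cpu_id);
}

/*
 * Hypothetical helper: returns 1 if the kernel advertises
 * MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0 if it does not,
 * and -1 if the query itself fails (e.g. ENOSYS or a seccomp filter).
 */
static int sync_core_supported(void)
{
	int mask = membarrier(MEMBARRIER_CMD_QUERY, 0, 0);

	if (mask < 0)
		return -1;
	return (mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE) ? 1 : 0;
}

/*
 * Hypothetical helper: register the process once, before the first
 * SYNC_CORE request.  Without registration the expedited command
 * fails with EPERM.
 */
static int jit_register(void)
{
	return membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE,
			  0, 0);
}

/*
 * Hypothetical helper: thread A calls this after writing the new
 * instructions and before letting any other thread jump to them.
 */
static int jit_publish_barrier(void)
{
	return membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0, 0);
}
```

A caller would check sync_core_supported(), call jit_register() once at startup, and then call jit_publish_barrier() after each code modification. Whether thread B can safely end up on thread A's CPU afterwards is exactly what depends on membarrier() also syncing the calling CPU.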