* Mathieu Desnoyers:

> ----- On Dec 13, 2021, at 2:29 PM, Florian Weimer fweimer@xxxxxxxxxx wrote:
>
>> * Mathieu Desnoyers:
>>
>>>> Could it fall back to
>>>> MEMBARRIER_CMD_GLOBAL instead?
>>>
>>> No. CMD_GLOBAL does not issue the required rseq fence used by the
>>> algorithm discussed. Also, CMD_GLOBAL has quite a few other shortcomings:
>>> it takes a while to execute, and is incompatible with nohz_full kernels.
>>
>> What about using sched_setcpu to move the current thread to the same CPU
>> (and move it back afterwards)? Surely that implies the required sort of
>> rseq barrier that MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ with
>> MEMBARRIER_CMD_FLAG_CPU performs?
>
> I guess you refer to using sched_setaffinity(2) there ? There are various
> reasons why this may fail. For one, the affinity mask is a shared global
> resource which can be changed by external applications.

So is process memory …

> Also, setting the affinity is really just a hint. In the presence of
> cpu hotplug and or cgroup cpuset, it is known to lead to situations
> where the kernel just gives up and provides an affinity mask including
> all CPUs.

How does CPU hotplug impact this negatively?

The cgroup cpuset issue clearly is a bug.

> Therefore, using sched_setaffinity() and expecting to be pinned to
> a specific CPU for correctness purposes seems brittle.

I'm pretty sure it used to work reliably for some forms of concurrency
control.

> But _if_ we'd have something like a sched_setaffinity which we can
> trust, yes, temporarily migrating to the target CPU, and observing that
> we indeed run there, would AFAIU provide the same guarantee as the rseq
> fence provided by membarrier. It would have a higher overhead than
> membarrier as well.

Presumably a signal could do it as well.

>> That is possible even without membarrier, so I wonder why registration
>> of intent is needed for MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ.
>
> I would answer that it is not possible to do this _reliably_ today
> without membarrier (see above discussion of cpu hotplug, cgroups, and
> modification of cpu affinity by external processes).
>
> AFAIR, registration of intent for MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ
> is mainly there to provide a programming model similar to private expedited
> plain and core-sync cmds.
>
> The registration of intent allows the kernel to further tweak what is
> done internally and make tradeoffs which only impact applications
> performing the registration.

But if there is no strong performance argument to do so, this introduces
additional complexity into userspace.  Surely we could say we just do
MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ at process start and document
failure (in case of seccomp etc.), but then why do this at all?

>>> In order to make sure the programming model is the same for expedited
>>> private/global plain/sync-core/rseq membarrier commands, we require that
>>> each process perform a registration beforehand.
>>
>> Hmm. At least it's not possible to unregister again.
>>
>> But I think it would be really useful to have some of these barriers
>> available without registration, possibly in a more expensive form.
>
> What would be wrong with doing a membarrier private-expedited-rseq
> registration on libc startup, and exposing a glibc tunable to allow
> disabling this ?

The configurations that need to be supported go from “no rseq”/“rseq” to
“no rseq”/“rseq”/“rseq with membarrier”.  Everyone now needs to think
about implementing support for all three instead of just the obvious two.

Thanks,
Florian
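
For reference, the startup registration discussed above might look roughly
like the sketch below.  It is only an illustration under stated assumptions,
not what glibc does: the SK_ constants and the sk_* function names are
hypothetical, the numeric values are copied from <linux/membarrier.h> as of
Linux 5.10 so that the code also builds against older headers, and rseq
registration itself is not shown.  It assumes Linux, where
<sys/syscall.h> defines __NR_membarrier.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values as in <linux/membarrier.h> (Linux 5.10+), repeated under
   hypothetical SK_ names so the sketch builds with older headers. */
enum {
  SK_CMD_QUERY = 0,
  SK_CMD_PRIVATE_EXPEDITED_RSEQ = 1 << 7,
  SK_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ = 1 << 8,
  SK_CMD_FLAG_CPU = 1 << 0,
};

/* Hypothetical result of the startup probe.  A caller still has to
   handle the "rseq without membarrier" case on its own. */
enum rseq_fence_mode {
  RSEQ_FENCE_UNAVAILABLE,   /* old kernel, seccomp, or no rseq support */
  RSEQ_FENCE_MEMBARRIER,    /* registration succeeded */
};

/* Thin wrapper around the raw system call; returns -1 and sets errno
   on failure, like syscall(2) itself. */
long sk_membarrier(int cmd, unsigned int flags, int cpu_id)
{
  return syscall(__NR_membarrier, cmd, flags, cpu_id);
}

/* Probe the supported commands, then register intent.  Any failure
   is reported as "unavailable" instead of aborting the process. */
enum rseq_fence_mode sk_rseq_fence_init(void)
{
  long supported = sk_membarrier(SK_CMD_QUERY, 0, 0);
  if (supported < 0 || !(supported & SK_CMD_PRIVATE_EXPEDITED_RSEQ))
    return RSEQ_FENCE_UNAVAILABLE;
  if (sk_membarrier(SK_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ, 0, 0) != 0)
    return RSEQ_FENCE_UNAVAILABLE;
  return RSEQ_FENCE_MEMBARRIER;
}

/* Issue the rseq fence for one CPU.  Returns 0 on success and -1 with
   errno set otherwise, for example when no registration was made. */
int sk_rseq_fence_cpu(int cpu)
{
  return (int) sk_membarrier(SK_CMD_PRIVATE_EXPEDITED_RSEQ,
                             SK_CMD_FLAG_CPU, cpu);
}
```

A caller would run sk_rseq_fence_init() once at startup and take the
per-CPU fence path only when it returns RSEQ_FENCE_MEMBARRIER, which is
exactly the third configuration mentioned above.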