On Wed, Sep 20, 2017 at 11:13 AM, Mathieu Desnoyers
<mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>
> ----- On Sep 20, 2017, at 12:02 PM, Andy Lutomirski luto@xxxxxxxxxx wrote:
>
> > On Sun, Sep 17, 2017 at 3:36 PM, Paul E. McKenney
> > <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >> Hello!
> >>
> >> Rough notes from our discussion last Thursday. Please reply to the
> >> group with any needed elaborations or corrections.
> >>
> >> Adding Andy and Michael on CC since this most closely affects their
> >> architectures. Also adding Dave Watson and Maged Michael because
> >> the preferred approach requires that processes wanting to use the
> >> lightweight sys_membarrier() do a registration step.
> >
> > Not to be too much of a curmudgeon, but I think that there should be a
> > real implementation of the isync membarrier before this gets merged.
> > This series purports to solve two problems, ppc barriers and x86
> > exit-without-isync, but it's very hard to evaluate whether it actually
> > solves the latter problem given the complete lack of x86 or isync code
> > in the current RFC.
> >
> > It still seems to me that you won't get any particular advantage for
> > using this registration mechanism on x86 even when you implement
> > isync. Unless I've misunderstood, the only real issue on x86 is that
> > you need a helper like arch_force_isync_before_usermode(), and that
> > helper doesn't presently exist. That means that this whole patchset
> > is standing on very dangerous ground: you'll end up with an efficient
> > implementation that works just fine without even requesting
> > registration on every architecture except ppc. That way lies
> > userspace bugs.
>
> My proposed RFC for private expedited membarrier enforces that all
> architectures perform the registration step. Using the "PRIVATE_EXPEDITED"
> command without prior process registration returns an error on all
> architectures. The goal here is to make all architectures behave in the
> same way, and it allows us to rely on process registration to deal
> with future arch-specific optimizations.

Fair enough. That being said, on some architectures (which may well be
all but PPC), it might be nice if the registration call literally just
sets a flag in the mm saying that it happened so that the registration
enforcement can be done.

>
>
> Adding the "core_sync" behavior could then be done for the next kernel
> merge window. I'm currently foreseeing two possible ABI approaches to
> expose it:
>
> Approach 1:
>
> Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE and
> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE commands. This
> allows us to return their availability through MEMBARRIER_CMD_QUERY.
>
> Approach 2:
>
> Add a "MEMBARRIER_FLAG_SYNC_CORE" as flag parameter. It could be set
> when issuing both MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED and
> MEMBARRIER_CMD_PRIVATE_EXPEDITED, thus ensuring core serializing
> behavior. Querying whether core serialization is supported could
> be done by issuing the MEMBARRIER_CMD_QUERY command with the
> MEMBARRIER_FLAG_SYNC_CORE flag set.
>
> Any other ideas ? Any approach seems better ?

It doesn't seem to make much difference to me.
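
To make the comparison concrete, here is a rough userspace toy model of
the "registration just sets a flag in the mm" idea and of the two ABI
shapes. It is only a sketch under stated assumptions, not kernel code:
every toy_/TOY_ name and every numeric value is made up for illustration,
the -EPERM error code is an assumption (the RFC description only says
"returns an error"), and so are the choices that sync-core registration
is tracked separately from plain registration and that a flagged QUERY
returns the mask of commands accepting the flag.

```c
/* Toy userspace model of the two proposed ABIs. NOT kernel code:
 * every toy_/TOY_ name and every numeric value is hypothetical. */
#include <errno.h>
#include <stdbool.h>

enum {
    TOY_CMD_QUERY                                = 0,
    TOY_CMD_PRIVATE_EXPEDITED                    = 1 << 0,
    TOY_CMD_REGISTER_PRIVATE_EXPEDITED           = 1 << 1,
    /* Approach 1 only: dedicated sync-core commands. */
    TOY_CMD_PRIVATE_EXPEDITED_SYNC_CORE          = 1 << 2,
    TOY_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE = 1 << 3,
};

/* Approach 2 only: a flag that modifies the existing commands. */
#define TOY_FLAG_SYNC_CORE (1 << 0)

/* Stand-in for the per-process mm: registration only sets a flag. */
struct toy_mm {
    bool registered;
    bool registered_sync_core;
};

/* Registration enforcement: refuse the expedited command unless the
 * matching registration flag is set (error code is an assumption). */
static int toy_expedited(bool registered)
{
    if (!registered)
        return -EPERM;
    /* A real implementation would issue the barrier / core sync here. */
    return 0;
}

/* Approach 1: separate commands, all discoverable through QUERY. */
int toy_membarrier_v1(struct toy_mm *mm, int cmd)
{
    switch (cmd) {
    case TOY_CMD_QUERY:
        return TOY_CMD_PRIVATE_EXPEDITED |
               TOY_CMD_REGISTER_PRIVATE_EXPEDITED |
               TOY_CMD_PRIVATE_EXPEDITED_SYNC_CORE |
               TOY_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE;
    case TOY_CMD_REGISTER_PRIVATE_EXPEDITED:
        mm->registered = true;
        return 0;
    case TOY_CMD_PRIVATE_EXPEDITED:
        return toy_expedited(mm->registered);
    case TOY_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE:
        mm->registered_sync_core = true;
        return 0;
    case TOY_CMD_PRIVATE_EXPEDITED_SYNC_CORE:
        return toy_expedited(mm->registered_sync_core);
    default:
        return -EINVAL;
    }
}

/* Approach 2: same two commands, behavior selected by a flag. */
int toy_membarrier_v2(struct toy_mm *mm, int cmd, int flags)
{
    bool sync_core = (flags & TOY_FLAG_SYNC_CORE) != 0;
    bool *reg;

    if (flags & ~TOY_FLAG_SYNC_CORE)
        return -EINVAL;         /* unknown flag bits */
    reg = sync_core ? &mm->registered_sync_core : &mm->registered;

    switch (cmd) {
    case TOY_CMD_QUERY:
        /* Assumption: with the flag set, a nonzero mask means the
         * listed commands accept TOY_FLAG_SYNC_CORE. */
        return TOY_CMD_PRIVATE_EXPEDITED |
               TOY_CMD_REGISTER_PRIVATE_EXPEDITED;
    case TOY_CMD_REGISTER_PRIVATE_EXPEDITED:
        *reg = true;
        return 0;
    case TOY_CMD_PRIVATE_EXPEDITED:
        return toy_expedited(*reg);
    default:
        return -EINVAL;
    }
}
```

In this model the enforcement path is identical for both approaches; the
only difference is whether userspace discovers sync-core support via new
command bits in the QUERY result or via a flagged QUERY.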