----- On Sep 21, 2017, at 9:15 AM, Peter Zijlstra peterz@xxxxxxxxxxxxx wrote:

> On Wed, Sep 20, 2017 at 06:13:50PM +0000, Mathieu Desnoyers wrote:
>> My proposed RFC for private expedited membarrier enforces that all
>> architectures perform the registration step. Using the "PRIVATE_EXPEDITED"
>> command without prior process registration returns an error on all
>> architectures. The goal here is to make all architectures behave in the
>> same way, and it allows us to rely on process registration to deal
>> with future arch-specific optimizations.
>>
>> Adding the "core_sync" behavior could then be done for the next kernel
>> merge window. I'm currently foreseeing two possible ABI approaches to
>> expose it:
>>
>> Approach 1:
>>
>> Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE and
>> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE commands. This
>> allows us to return their availability through MEMBARRIER_CMD_QUERY.
>>
>> Approach 2:
>>
>> Add a "MEMBARRIER_FLAG_SYNC_CORE" as flag parameter. It could be set
>> when issuing both MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED and
>> MEMBARRIER_CMD_PRIVATE_EXPEDITED, thus ensuring core serializing
>> behavior. Querying whether core serialization is supported could
>> be done by issuing the MEMBARRIER_CMD_QUERY command with the
>> MEMBARRIER_FLAG_SYNC_CORE flag set.
>>
>> Any other ideas ? Any approach seems better ?
>
> So we really need another FLAG for that? AFAICT the current
> PRIVATE_EXPEDITED is already sufficient for the cross modifying code,
> since the IPI triggers an exception return on all currently running CPUs
> and the future running CPUs will have the return to userspace doing the
> exception return.
>
> The only issue is Andy fudging our x86 ret-to-userspace to not use IRET,
> which we can fix by forcing it into the slowpath (that needs to exist
> anyway) using that new TIF flag.

I agree that x86, as it stands today, would provide core serialization
with the private expedited membarrier command. And we can deal with
future optimization of ret-to-userspace using the TIF flag set on
registration.

I'm wondering whether all architectures guarantee core serialization
on return from the interrupt triggered by the IPI, and on ret-to-userspace?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
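
For reference, the register-then-issue flow described above could look roughly like this from userspace. This is only a sketch: it assumes uapi headers that already define MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED and MEMBARRIER_CMD_PRIVATE_EXPEDITED, and the helper names (sys_membarrier, mask_has_private_expedited, private_expedited_barrier) are made up for illustration. It does not touch either of the proposed SYNC_CORE variants.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper around the raw system call. */
static int sys_membarrier(int cmd, int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags);
}

/*
 * Hypothetical helper: does a MEMBARRIER_CMD_QUERY result advertise both
 * the registration command and the private expedited command?
 * A negative mask means the query itself failed.
 */
static int mask_has_private_expedited(int mask)
{
	const int needed = MEMBARRIER_CMD_PRIVATE_EXPEDITED |
			   MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED;

	return mask >= 0 && (mask & needed) == needed;
}

/*
 * Query, register the process, then issue one private expedited barrier.
 * Returns 0 on success, -ENOSYS if the commands are not advertised,
 * or -errno if registration or the barrier itself fails.
 */
static int private_expedited_barrier(void)
{
	int mask = sys_membarrier(MEMBARRIER_CMD_QUERY, 0);

	if (!mask_has_private_expedited(mask))
		return -ENOSYS;
	/* Registration first: the RFC makes this mandatory on all archs. */
	if (sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0) < 0)
		return -errno;
	if (sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0) < 0)
		return -errno;
	return 0;
}
```

A caller would invoke private_expedited_barrier() once per barrier in this sketch; a real user would presumably register once at startup and only repeat the MEMBARRIER_CMD_PRIVATE_EXPEDITED call afterwards.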