On Fri, Jun 18, 2021, at 7:53 PM, Nicholas Piggin wrote:
> Excerpts from Andy Lutomirski's message of June 18, 2021 9:49 am:
> > On 6/16/21 11:51 PM, Nicholas Piggin wrote:
> >> Excerpts from Andy Lutomirski's message of June 17, 2021 3:32 pm:
> >>> On Wed, Jun 16, 2021, at 7:57 PM, Andy Lutomirski wrote:
> >>>>
> >>>>
> >>>> On Wed, Jun 16, 2021, at 6:37 PM, Nicholas Piggin wrote:
> >>>>> Excerpts from Andy Lutomirski's message of June 17, 2021 4:41 am:
> >>>>>> On 6/16/21 12:35 AM, Peter Zijlstra wrote:
> >>>>>>> On Wed, Jun 16, 2021 at 02:19:49PM +1000, Nicholas Piggin wrote:
> >>>>>>>> Excerpts from Andy Lutomirski's message of June 16, 2021 1:21 pm:
> >>>>>>>>> membarrier() needs a barrier after any CPU changes mm. There is currently
> >>>>>>>>> a comment explaining why this barrier probably exists in all cases. This
> >>>>>>>>> is very fragile -- any change to the relevant parts of the scheduler
> >>>>>>>>> might get rid of these barriers, and it's not really clear to me that
> >>>>>>>>> the barrier actually exists in all necessary cases.
> >>>>>>>>
> >>>>>>>> The comments and barriers in the mmdrop() hunks? I don't see what is
> >>>>>>>> fragile or maybe-buggy about this. The barrier definitely exists.
> >>>>>>>>
> >>>>>>>> And any change can change anything, that doesn't make it fragile. My
> >>>>>>>> lazy tlb refcounting change avoids the mmdrop in some cases, but it
> >>>>>>>> replaces it with smp_mb for example.
> >>>>>>>
> >>>>>>> I'm with Nick again, on this. You're adding extra barriers for no
> >>>>>>> discernible reason, that's not generally encouraged, seeing how extra
> >>>>>>> barriers is extra slow.
> >>>>>>>
> >>>>>>> Both mmdrop() itself, as well as the callsite have comments saying how
> >>>>>>> membarrier relies on the implied barrier, what's fragile about that?
> >>>>>>>
> >>>>>>
> >>>>>> My real motivation is that mmgrab() and mmdrop() don't actually need to
> >>>>>> be full barriers. The current implementation has them being full
> >>>>>> barriers, and the current implementation is quite slow. So let's try
> >>>>>> that commit message again:
> >>>>>>
> >>>>>> membarrier() needs a barrier after any CPU changes mm. There is currently
> >>>>>> a comment explaining why this barrier probably exists in all cases. The
> >>>>>> logic is based on ensuring that the barrier exists on every control flow
> >>>>>> path through the scheduler. It also relies on mmgrab() and mmdrop() being
> >>>>>> full barriers.
> >>>>>>
> >>>>>> mmgrab() and mmdrop() would be better if they were not full barriers. As a
> >>>>>> trivial optimization, mmgrab() could use a relaxed atomic and mmdrop()
> >>>>>> could use a release on architectures that have these operations.
> >>>>>
> >>>>> I'm not against the idea, I've looked at something similar before (not
> >>>>> for mmdrop but a different primitive). Also my lazy tlb shootdown series
> >>>>> could possibly take advantage of this, I might cherry pick it and test
> >>>>> performance :)
> >>>>>
> >>>>> I don't think it belongs in this series though. Should go together with
> >>>>> something that takes advantage of it.
> >>>>
> >>>> I’m going to see if I can get hazard pointers into shape quickly.
> >>>
> >>> Here it is. Not even boot tested!
> >>>
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=sched/lazymm&id=ecc3992c36cb88087df9c537e2326efb51c95e31
> >>>
> >>> Nick, I think you can accomplish much the same thing as your patch by:
> >>>
> >>> #define for_each_possible_lazymm_cpu while (false)
> >>
> >> I'm not sure what you mean? For powerpc, other CPUs can be using the mm
> >> as lazy at this point. I must be missing something.
> >
> > What I mean is: if you want to shoot down lazies instead of doing the
> > hazard pointer trick to track them, you could do:
> >
> > #define for_each_possible_lazymm_cpu while (false)
> >
> > which would promise to the core code that you don't have any lazies left
> > by the time exit_mmap() is done. You might need a new hook in
> > exit_mmap() depending on exactly how you implement the lazy shootdown.
>
> Oh for configuring it away entirely. I'll have to see how it falls out,
> I suspect we'd want to just no-op that entire function and avoid the 2
> atomics if we are taking care of our lazy mms with shootdowns.

Do you mean the smp_store_release()? On x86 and similar architectures,
that’s almost free. I’m also not convinced it needs to be a real
release.

>
> The more important thing would be the context switch fast path, but even
> there, there's really no reason why the two approaches couldn't be made
> to both work with some careful helper functions or structuring of the
> code.
>
> Thanks,
> Nick
>
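
For readers following the "relaxed mmgrab() / release mmdrop()" idea
discussed above: it is the usual refcounting pattern in which taking a
reference needs no ordering, while the final drop needs release
semantics paired with an acquire before the object is freed, so that
all earlier accesses to the object are ordered before teardown. The
sketch below is only an illustration of that ordering pattern using
C11 atomics; struct obj, obj_get() and obj_put() are made-up names,
and this is not the kernel's actual mm_count code.

/*
 * Sketch only: generic relaxed-increment / release-decrement refcount.
 */
#include <stdatomic.h>
#include <stdlib.h>

struct obj {
	atomic_int refcount;
};

static void obj_get(struct obj *o)
{
	/* Taking a reference needs no ordering against other accesses. */
	atomic_fetch_add_explicit(&o->refcount, 1, memory_order_relaxed);
}

static void obj_put(struct obj *o)
{
	/*
	 * Release orders this thread's prior accesses before the drop;
	 * the acquire fence before freeing pairs with the release
	 * performed by other threads that dropped their references.
	 */
	if (atomic_fetch_sub_explicit(&o->refcount, 1,
				      memory_order_release) == 1) {
		atomic_thread_fence(memory_order_acquire);
		free(o);
	}
}

Whether the scheduler's mmdrop() path can get away with only
release/acquire ordering is exactly what the thread above is debating
(membarrier wants a full barrier after an mm switch), so treat this as
a picture of the ordering pattern, not a claim about what the kernel
requires.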