On Fri, Dec 06, 2019 at 09:11:25AM -0800, Shakeel Butt wrote: > On Thu, Dec 5, 2019 at 6:10 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > If a shrinker is blocking for a long time, then we need to > > work to fix the shrinker implementation because blocking is a much > > bigger problem than just register/unregister. > > > > Yes, we should be fixing the implementations of all shrinkers and yes > it is bigger issue but we can also fix register/unregister isolation > issue in parallel. Fixing all shrinkers would a tedious and long task > and we should not block fixing isolation issue on it. "fixing all shrinkers" is a bit of hyperbole - you've identified only one instance where blocking is causing you problems. Indeed, most shrinkers are already non-blocking and won't cause you any problems at all. > > IOWs, we already know that cycling a global rwsem on every > > individual shrinker invocation is going to cause noticable > > scalability problems. Hence I don't think that this sort of "cycle > > the global rwsem faster to reduce [un]register latency" solution is > > going to fly because of the runtime performance regressions it will > > introduce.... > > > > I agree with your scalability concern (though others would argue to > first demonstrate the issue before adding more sophisticated scalable > code). Look at the git history. We *know* this is a problem, so anyone arguing that we have to prove it can go take a long walk of a short plank.... > Most memory reclaim code is written without the performance or > scalability concern, maybe we should switch our thinking. I think there's a lot of core mm and other developers that would disagree with you there. With respect to shrinkers, we've been directly concerned about performance and scalability of the individual instances as well as the infrastructure for at least the last decade.... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx