On Thu, May 07, 2020 at 10:09:03AM -0700, Paul E. McKenney wrote:
> On Thu, May 07, 2020 at 01:00:06PM -0400, Johannes Weiner wrote:
> > On Wed, May 06, 2020 at 05:55:35PM -0700, Andrew Morton wrote:
> > > On Wed, 6 May 2020 17:42:40 -0700 "Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:
> > >
> > > > This commit adds a shrinker so as to inform RCU when memory is scarce.
> > > > RCU responds by shifting into the same fast and inefficient mode that is
> > > > used in the presence of excessive numbers of RCU callbacks.  RCU remains
> > > > in this state for one-tenth of a second, though this time window can be
> > > > extended by another call to the shrinker.
> >
> > We may be able to use shrinkers here, but merely being invoked does
> > not carry a reliable distress signal.
> >
> > Shrinkers get invoked whenever vmscan runs. It's a useful indicator
> > for when to age an auxiliary LRU list - test references, clear and
> > rotate or reclaim stale entries. The urgency, and what can and cannot
> > be considered "stale", is encoded in the callback frequency and scan
> > counts, and meant to be relative to the VM's own rate of aging: "I've
> > tested X percent of mine for recent use, now you go and test the same
> > share of your pool." It doesn't translate well to other
> > interpretations of the callbacks, although people have tried.
>
> Would it make sense for RCU to interpret two invocations within (say)
> 100ms of each other as indicating urgency?  (Hey, I had to ask!)
>
> > > > If it proves feasible, a later commit might add a function call directly
> > > > indicating the end of the period of scarce memory.
> > >
> > > (Cc David Chinner, who often has opinions on shrinkers ;))
> > >
> > > It's a bit abusive of the intent of the slab shrinkers, but I don't
> > > immediately see a problem with it.  Always returning 0 from
> > > ->scan_objects might cause a problem in some situations(?).
> > >
> > > Perhaps we should have a formal "system getting low on memory, please
> > > do something" notification API.
> >
> > It's tricky to find a useful definition of what low on memory
> > means. In the past we've used sc->priority cutoffs, the vmpressure
> > interface (reclaimed/scanned - reclaim efficiency cutoffs), oom
> > notifiers (another reclaim efficiency cutoff). But none of these
> > reliably capture "distress", and they vary highly between different
> > hardware setups. It can be hard to trigger OOM itself on fast IO
> > devices, even when the machine is way past useful (where useful is
> > somewhat subjective to the user). Userspace OOM implementations that
> > consider userspace health (also subjective) are getting more common.
> >
> > > How significant is this?  How much memory can RCU consume?
> >
> > I think if rcu can end up consuming a significant share of memory, one
> > way that may work would be to do proper shrinker integration and track
> > the age of its objects relative to the age of other allocations in the
> > system. I.e. toss them all on a clock list with "new" bits and shrink
> > them at VM velocity. If the shrinker sees objects with new bit set,
> > clear and rotate. If it sees objects without them, we know rcu_heads
> > outlive cache pages etc. and should probably cycle faster too.
>
> It would be easy for RCU to pass back (or otherwise use) the age of the
> current grace period, if that would help.
>
> Tracking the age of individual callbacks is out of the question due to
> memory overhead, but RCU could approximate this via statistical sampling.
> Comparing this to grace-period durations could give information as to
> whether making grace periods go faster would be helpful.
>
> But, yes, it would be better to have an elusive unambiguous indication
> of distress.  ;-)

And I have dropped this patch for the time being, but I do hope that it
served a purpose in illustrating that it is not difficult to put RCU
into a fast-but-inefficient mode when needed.

							Thanx, Paul
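
For reference, the clock list with "new" bits suggested in the quoted
text can be sketched in plain userspace C.  This is only an
illustration under stated assumptions, not kernel code and not the
dropped patch: struct clock_entry, struct clock_list, struct
sweep_result and clock_sweep() are all hypothetical names, and a
fixed-size ring stands in for whatever list RCU would really keep.
Entries seen with the new bit set get it cleared and are passed over
(the "clear and rotate" step); entries seen without it have survived a
full pass of the hand and are counted as stale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracked object; not a kernel structure. */
struct clock_entry {
	bool is_new;	/* set when queued, cleared the first time the hand passes */
	bool live;	/* false once the sweep has marked the entry stale */
};

/* Hypothetical ring of entries plus the clock hand. */
struct clock_list {
	struct clock_entry *entries;
	size_t n;
	size_t hand;	/* index of the next entry to examine */
};

struct sweep_result {
	size_t rotated;	/* entries that were new: bit cleared, second chance given */
	size_t stale;	/* entries already seen once: they outlived a full pass */
};

/*
 * Examine up to nr_to_scan slots starting at the hand, wrapping around.
 * Slots holding dead entries still consume scan budget in this sketch.
 */
static struct sweep_result clock_sweep(struct clock_list *cl, size_t nr_to_scan)
{
	struct sweep_result r = { 0, 0 };

	assert(cl != NULL);
	if (cl->n == 0)
		return r;

	while (nr_to_scan-- > 0) {
		struct clock_entry *e = &cl->entries[cl->hand];

		cl->hand = (cl->hand + 1) % cl->n;
		if (!e->live)
			continue;
		if (e->is_new) {
			e->is_new = false;	/* clear and rotate */
			r.rotated++;
		} else {
			e->live = false;	/* survived a whole pass: stale */
			r.stale++;
		}
	}
	return r;
}
```

In this sketch a nonzero stale count plays the role of the "should
probably cycle faster" hint: it says that tracked objects are
outliving a full pass at whatever rate the caller drives the sweep.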