On Fri, Jan 03, 2020 at 06:31:33PM -0800, Paul E. McKenney wrote:
> On Fri, Jan 03, 2020 at 08:56:17PM -0500, Joel Fernandes wrote:
> > On Wed, Dec 25, 2019 at 05:05:32PM -0800, Paul E. McKenney wrote:
> > > On Wed, Dec 25, 2019 at 05:41:04PM -0500, Joel Fernandes wrote:
> > > > Hi Paul,
> > > > We were discussing some ideas on facebook so I wanted to just post
> > > > them here as well. This is in the context of the RCU section of RT MC
> > > > https://www.youtube.com/watch?v=bpyFQJV5gCI
> > > >
> > > > Detecting high kfree_rcu() load
> > > > ----------
> > > > You mentioned about this. As I understand it, we did the kfree_rcu()
> > > > batching to let the system not do anything RCU related until a batch
> > > > has filled up enough or a timeout has occurred. This makes the GP
> > > > thread and the system do less work.
> > > > The problem you are raising in our facebook thread is, that during
> > > > heavy load the "batch" can be large and be dumped into call_rcu()
> > > > eventually. Wouldn't this be better handled generically within
> > > > call_rcu() itself, for the benefit of other non-kfree_rcu workloads?
> > > > That is if a large number of callbacks is dumped, then try to end the
> > > > GP more quickly. This likely doesn't need a signal from kfree_rcu()
> > > > since call_rcu() knows that it is being hammered.
> > > >
> > > Except that call_rcu() currently has no idea how many parcels of memory
> > > a given request from kfree_rcu() represents.
> > >
> > True. At the moment, neither does kfree_rcu() since we store only the
> > pointer. We could consult the low level allocator if they have this
> > information. If you could let me know how to make RCU more aggressive in this
> > case (once we know there's a problem), I could work on something like this. I
> > did have OOM issues in earlier versions of the kfree_rcu() patch. I could
> > boot a system with less memory and OOM it too with the tests even now.
>
> Let's keep things simple, at first at least! ;-)
>
> Currently, call_rcu() has no idea how much memory is tied up by a normal
> callback, either. But just counting the callbacks (or, in the case of
> kfree_rcu(), counting the block of memory, independent of size) is at
> least correlated with the memory footprint. Plus that is what has been
> used in the past, so it should be a good place to start.
>
> Besides, how many call_rcu() invocations is a 1K kfree_rcu() invocation
> worth? A 8K kfree_rcu() invocation? A 64-byte kfree_rcu() invocation?
>
> We might need to answer those questions over time, but again, let's start
> simple.

Sounds great.

> > > > Detecting recursive call_rcu() within call_rcu()
> > > > ---------
> > > > We could use a per-cpu variable to detect a scenario like this, though
> > > > I am not sure if preemption during call_rcu() itself would cause false
> > > > positives.
> > > >
> > > A call_rcu() from within an RCU callback function is legal and is
> > > sometimes done. Or are you thinking of a call_rcu() from an interrupt
> > > handler interrupting another call_rcu()?
> > >
> > Oh, did not know this. I thought this was the point heavily discussed in the
> > LPC talk but must have misunderstood when you said you hoped no one was
> > precisely doing this..
>
> What I hoped they avoid is a call_rcu() bomb, where each callback does
> several call_rcu() invocations. Just as with child processes invoking
> fork(), within broad limits it is OK for callback functions to invoke
> call_rcu(). There is at least one in rcutorture, for example, but it
> does just one call_rcu() and also checks a time-to-stop flag.

Ok, got it now.

> > > > ---------
> > > > How about doing this kind of call_rcu() to synchronize_rcu()
> > > > transition automatically if the context allows it? I.e. Detect the
> > > > context and if sleeping is allowed, then wait for the grace period
> > > > synchronously in call_rcu(). Not sure about deadlocks and the like
> > > > from this kind of waiting and have to think more.
> > > >
> > > This gets rather strange in a production PREEMPT=n build, so not a
> > > fan, actually. And in real-time systems, I pretty much have to splat
> > > anyway if I slow down call_rcu() by that much.
> > >
> > > So the preference is instead detecting such misconfiguration and issuing
> > > appropriate diagnostics. And making RCU more able to keep up when not
> > > grossly misconfigured, hence the kfree_rcu() memory footprint being
> > > fed into core RCU.
> > >
> > Ok. Is it not Ok to simply assume that a large number of callbacks queued
> > along with observing high memory pressure, means RCU should be more
> > aggressive anyway since whatever memory can be freed by invoking callbacks
> > should be helpful anyway? Or were you thinking making RCU aggressive when
> > there's a lot of memory pressure is not worth it, without knowing that RCU is
> > the cause for it?
>
> I used to have a memory-pressure switch for RCU, but the OOM guys hated
> it. But given a reliable "running short of memory" indicator, I would
> be quite happy to use it. After all, even if RCU is not at fault, it
> might still be helpful for it to pull its memory-footprint horns in a bit.

With recent advances in PSI, I am wondering if those pressure signals (for
memory) can be leveraged to pull the memory-footprint horns. I can look more
into this, I am also looking into PSI for other work things.

One thing I am wondering though is, say we get a reliable signal -- what
could RCU do? Were you thinking of having the FQS loop set the usual
emergency flags and hope the "RCU-idle" CPUs enter quiescent states, along
with additional signalling for rcu_read_unlock_special()? Will think more
about it..

As far as testing goes, I was thinking of initially running rcuperf on a
system with less memory and never entering OOM as a "test has passed"
indication.

> > > > BTW, I have 2 interns working on RCU (Amol and Madupharna also on
> > > > CC).
> > > > They were selected among several others as a part of the
> > > > LinuxFoundation mentorship program. They are familiar with RCU. I have
> > > > asked them to look at some RCU-list work and RCU sparse work. However,
> > > > I can also have them look into a few other things as time permits and
> > > > depending on what interests them.
> > > >
> > > Dog paddling before cliff diving, please! ;-)
> > >
> > Sure. They are working on relatively simpler things for their internship but
> > I just put these ideas out there with them on CC so they can pick something
> > else as well if they have time and interest ;-)
>
> I considered pointing them at KCSAN reports, but about 5% of them require
> global knowledge. And it is never clear up front which are the 5%. And
> that 5% of "real bugs" is most of the motivation for things like KCSAN.

Interesting.

> > > > Thanks, Merry Christmas!
> > > >
> > > And to you and yours as well!
> > >
> > Hope you had a good holiday season!
>
> It did! First holiday season in quite a few years featuring all
> three kids, though not all at once. Might be awhile until the next
> time that happens. Something about them being about 30 years old and
> widely dispersed. ;-)

Oh nice, happy to hear that and hope this year end brings the same.

> As the little one becomes more aware, your holiday seasons should become
> quite fun. Don't miss out! ;-)

Looking forward to it and will do ;)

thanks,

 - Joel
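
The PSI idea raised in the reply above can be made a bit more concrete with
a small userspace sketch. It assumes a kernel built with CONFIG_PSI=y, which
exposes /proc/pressure/memory with lines of the form
"some avg10=... avg60=... avg300=... total=...". The function names and the
threshold policy are hypothetical and chosen only for illustration. This is
not RCU code, and what RCU itself should do with such a signal is left open
in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse the "some" line of PSI output, for example:
 *   some avg10=1.23 avg60=0.50 avg300=0.10 total=12345
 * Returns 0 and stores the avg10 percentage on success,
 * or -1 if no parsable "some" line is found.
 */
int psi_some_avg10(const char *text, double *avg10)
{
	const char *p;

	assert(text != NULL && avg10 != NULL);

	p = strstr(text, "some avg10=");
	if (!p)
		return -1;
	if (sscanf(p, "some avg10=%lf", avg10) != 1)
		return -1;
	return 0;
}

/*
 * Hypothetical policy: report "running short of memory" when the
 * 10-second average stall percentage reaches threshold_pct.
 * A missing or unparsable signal is treated as "no pressure".
 */
int memory_pressure_high(const char *text, double threshold_pct)
{
	double avg10;

	if (psi_some_avg10(text, &avg10) != 0)
		return 0;
	return avg10 >= threshold_pct;
}

/*
 * Read /proc/pressure/memory into buf as a NUL-terminated string.
 * Returns 0 on success, -1 if the file is absent or empty
 * (for example, when PSI is not enabled).
 */
int read_psi_memory(char *buf, size_t len)
{
	FILE *f;
	size_t n;

	if (len == 0)
		return -1;
	f = fopen("/proc/pressure/memory", "r");
	if (!f)
		return -1;
	n = fread(buf, 1, len - 1, f);
	buf[n] = '\0';
	fclose(f);
	return n > 0 ? 0 : -1;
}
```

A test harness along the lines discussed (rcuperf on a small-memory system)
could call read_psi_memory() periodically and log whenever
memory_pressure_high() crosses the chosen threshold, alongside the
"never entered OOM" pass criterion.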