Re: RCU ideas discussed at LPC


 



On Fri, Jan 03, 2020 at 06:31:33PM -0800, Paul E. McKenney wrote:
> On Fri, Jan 03, 2020 at 08:56:17PM -0500, Joel Fernandes wrote:
> > On Wed, Dec 25, 2019 at 05:05:32PM -0800, Paul E. McKenney wrote:
> > > On Wed, Dec 25, 2019 at 05:41:04PM -0500, Joel Fernandes wrote:
> > > > Hi Paul,
> > > > We were discussing some ideas on Facebook so I wanted to post
> > > > them here as well. This is in the context of the RCU section of RT MC
> > > > https://www.youtube.com/watch?v=bpyFQJV5gCI
> > > > 
> > > > Detecting high kfree_rcu() load
> > > > ----------
> > > > You mentioned this. As I understand it, we did the kfree_rcu()
> > > > batching to let the system not do anything RCU related until a batch
> > > > has filled up enough or a timeout has occurred. This makes the GP
> > > > thread and the system do less work.
> > > > The problem you raised in our Facebook thread is that during
> > > > heavy load the "batch" can be large and be dumped into call_rcu()
> > > > eventually. Wouldn't this be better handled generically within
> > > > call_rcu() itself, for the benefit of other non-kfree_rcu workloads?
> > > > That is, if a large number of callbacks is dumped, then try to end the
> > > > GP more quickly. This likely doesn't need a signal from kfree_rcu()
> > > > since call_rcu() knows that it is being hammered.
> > > 
> > > Except that call_rcu() currently has no idea how many parcels of memory
> > > a given request from kfree_rcu() represents.
> > 
> > True. At the moment, neither does kfree_rcu() since we store only the
> > pointer. We could consult the low-level allocator if it has this
> > information. If you could let me know how to make RCU more aggressive in this
> > case (once we know there's a problem), I could work on something like this. I
> > did have OOM issues in earlier versions of the kfree_rcu() patch. I could
> > boot a system with less memory and OOM it too with the tests even now.
> 
> Let's keep things simple, at first at least!  ;-)
> 
> Currently, call_rcu() has no idea how much memory is tied up by a normal
> callback, either.  But just counting the callbacks (or, in the case of
> kfree_rcu(), counting the blocks of memory, independent of size) is at
> least correlated with the memory footprint.  Plus that is what has been
> used in the past, so it should be a good place to start.
> 
> Besides, how many call_rcu() invocations is a 1K kfree_rcu() invocation
> worth?  An 8K kfree_rcu() invocation?  A 64-byte kfree_rcu() invocation?
> 
> We might need to answer those questions over time, but again, let's start
> simple.

Sounds great.
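
To make the "just count them" starting point concrete, here is a minimal userspace sketch of the accounting, not kernel code. The names cb_accounting, cb_account_enqueue(), cb_account_invoke(), cb_should_hurry() and the CB_FLOOD_THRESHOLD value are all made up for illustration; each kfree_rcu() block counts as one unit regardless of its size.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold, chosen only for illustration. */
#define CB_FLOOD_THRESHOLD 10000

struct cb_accounting {
	size_t n_queued;	/* callbacks or kfree_rcu() blocks awaiting a GP */
};

/* Account for nr queued callbacks or kfree_rcu() blocks, ignoring size. */
static void cb_account_enqueue(struct cb_accounting *acct, size_t nr)
{
	acct->n_queued += nr;
}

/* Account for nr callbacks invoked after a grace period, clamping at zero. */
static void cb_account_invoke(struct cb_accounting *acct, size_t nr)
{
	acct->n_queued -= (nr > acct->n_queued) ? acct->n_queued : nr;
}

/* True when the backlog is large enough that ending GPs sooner seems wise. */
static bool cb_should_hurry(const struct cb_accounting *acct)
{
	return acct->n_queued > CB_FLOOD_THRESHOLD;
}
```

A size-weighted version would only change what gets passed as nr, so the questions about what a 1K or 8K block is worth can be deferred.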

> > > > Detecting recursive call_rcu() within call_rcu()
> > > > ---------
> > > > We could use a per-cpu variable to detect a scenario like this, though
> > > > I am not sure if preemption during call_rcu() itself would cause false
> > > > positives.
> > > 
> > > A call_rcu() from within an RCU callback function is legal and is
> > > sometimes done.  Or are you thinking of a call_rcu() from an interrupt
> > > handler interrupting another call_rcu()?
> > 
> > Oh, I did not know this. I thought this was the point heavily discussed in
> > the LPC talk, but I must have misunderstood when you said you hoped no one
> > was doing precisely this.
> 
> What I hoped they avoid is a call_rcu() bomb, where each callback does
> several call_rcu() invocations.  Just as with child processes invoking
> fork(), within broad limits it is OK for callback functions to invoke
> call_rcu().  There is at least one in rcutorture, for example, but it
> does just one call_rcu() and also checks a time-to-stop flag.

Ok, got it now.
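
For my own notes, a toy userspace model of the well-behaved pattern you describe: a callback that re-posts itself exactly once and checks a time-to-stop flag. This is not the rcutorture code; sim_call_rcu(), sim_grace_period(), polite_cb() and the fixed-size queue are stand-ins I made up, and a "bomb" would be a callback that calls sim_call_rcu() several times per invocation.

```c
#include <stdbool.h>
#include <stddef.h>

#define SIM_QUEUE_MAX 64

struct sim_cb {
	void (*func)(struct sim_cb *cb);
};

static struct sim_cb *sim_queue[SIM_QUEUE_MAX];
static size_t sim_head, sim_tail;	/* simple FIFO, no wraparound */
static bool time_to_stop;
static size_t n_invoked;

/* Stand-in for call_rcu(): queue the callback for a later grace period. */
static bool sim_call_rcu(struct sim_cb *cb, void (*func)(struct sim_cb *))
{
	if (sim_tail == SIM_QUEUE_MAX)
		return false;
	cb->func = func;
	sim_queue[sim_tail++] = cb;
	return true;
}

/* Re-posts itself at most once per invocation, and only until told to stop. */
static void polite_cb(struct sim_cb *cb)
{
	n_invoked++;
	if (!time_to_stop)
		sim_call_rcu(cb, polite_cb);
}

/* Stand-in for a grace period ending: invoke only callbacks queued before it. */
static void sim_grace_period(void)
{
	size_t end = sim_tail;

	while (sim_head < end) {
		struct sim_cb *cb = sim_queue[sim_head++];

		cb->func(cb);
	}
}
```

With one re-post per invocation the backlog stays constant across grace periods; with two or more it grows geometrically, which is the bomb.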

> > > > ---------
> > > > How about doing this kind of call_rcu() to synchronize_rcu()
> > > > transition automatically if the context allows it? I.e. Detect the
> > > > context and if sleeping is allowed, then wait for the grace period
> > > > synchronously in call_rcu(). Not sure about deadlocks and the like
> > > > from this kind of waiting and have to think more.
> > > 
> > > This gets rather strange in a production PREEMPT=n build, so not a
> > > fan, actually.  And in real-time systems, I pretty much have to splat
> > > anyway if I slow down call_rcu() by that much.
> > > 
> > > So the preference is instead detecting such misconfiguration and issuing
> > > appropriate diagnostics.  And making RCU more able to keep up when not
> > > grossly misconfigured, hence the kfree_rcu() memory footprint being
> > > fed into core RCU.
> > 
> > Ok. Is it not Ok to simply assume that a large number of callbacks queued
> > along with observing high memory pressure, means RCU should be more
> > aggressive anyway since whatever memory can be freed by invoking callbacks
> > should be helpful anyway? Or were you thinking making RCU aggressive when
> > there's a lot of memory pressure is not worth it, without knowing that RCU is
> > the cause for it?
> 
> I used to have a memory-pressure switch for RCU, but the OOM guys hated
> it.  But given a reliable "running short of memory" indicator, I would
> be quite happy to use it.  After all, even if RCU is not at fault, it
> might still be helpful for it to pull its memory-footprint horns in a bit.

With recent advances in PSI, I am wondering if its memory pressure signals
can be leveraged to pull in the memory-footprint horns. I can look more into
this, as I am also looking into PSI for other work things.

One thing I am wondering though is, say we get a reliable signal -- what
could RCU do? Were you thinking of having the FQS loop set the usual
emergency flags and hope the "RCU-idle" CPUs enter quiescent states, along
with additional signalling for rcu_read_unlock_special()?  I will think more
about it.

As far as testing goes, I was thinking of initially running rcuperf on a
system with less memory and never entering OOM as a "test has passed"
indication.
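
On the PSI idea, here is a rough userspace sketch of what consuming the signal could look like, assuming the "some avg10=... avg60=... avg300=... total=..." line format that /proc/pressure/memory exposes. The helper names psi_parse_line() and psi_memory_is_tight() and the threshold policy are hypothetical, and an in-kernel consumer inside RCU would need a different hook than parsing a proc file.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct psi_line {
	double avg10, avg60, avg300;	/* percent of time stalled */
	unsigned long long total_us;	/* cumulative stall time, microseconds */
};

/* Parse one "some ..." or "full ..." line; kind selects which one to accept. */
static bool psi_parse_line(const char *line, const char *kind,
			   struct psi_line *out)
{
	char name[8];

	if (sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
		   name, &out->avg10, &out->avg60, &out->avg300,
		   &out->total_us) != 5)
		return false;
	return strcmp(name, kind) == 0;
}

/* Hypothetical policy: the 10-second average at or above a threshold percent. */
static bool psi_memory_is_tight(const struct psi_line *some,
				double threshold_pct)
{
	return some->avg10 >= threshold_pct;
}
```

For the rcuperf test, something like this could log the "some" line periodically, so a run that avoids OOM but still shows sustained pressure does not count as a clean pass.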

> > > > BTW, I have 2 interns working on RCU (Amol and Madhuparna also on
> > > > CC).
> > > > They were selected among several others as a part of the
> > > > LinuxFoundation mentorship program. They are familiar with RCU. I have
> > > > asked them to look at some RCU-list work and RCU sparse work. However,
> > > > I can also have them look into a few other things as time permits and
> > > > depending on what interests them.
> > > 
> > > Dog paddling before cliff diving, please!  ;-)
> > 
> > Sure. They are working on relatively simpler things for their internship but
> > I just put these ideas out there with them on CC so they can pick something
> > else as well if they have time and interest ;-)
> 
> I considered pointing them at KCSAN reports, but about 5% of them require
> global knowledge.  And it is never clear up front which are the 5%.  And
> that 5% of "real bugs" is most of the motivation for things like KCSAN.

Interesting.

> > > > Thanks, Merry Christmas!
> > > 
> > > And to you and yours as well!
> > 
> > Hope you had a good holiday season!
> 
> I did!  First holiday season in quite a few years featuring all
> three kids, though not all at once.  Might be a while until the next
> time that happens.  Something about them being about 30 years old and
> widely dispersed.  ;-)

Oh nice, happy to hear that, and I hope this year's end brings the same.

> As the little one becomes more aware, your holiday seasons should become
> quite fun.  Don't miss out!  ;-)

Looking forward to it and will do ;)

thanks,

 - Joel



