Re: [PATCH v3 rcu-dev] rcuperf: Measure memory footprint during kfree_rcu() test

Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> · Wed, 15 Jan 2020 23:05:58 -0500

On Wed, Jan 15, 2020 at 04:01:04PM -0800, Paul E. McKenney wrote:
> On Wed, Jan 15, 2020 at 05:45:42PM -0500, Joel Fernandes wrote:
> > On Wed, Jan 15, 2020 at 02:42:51PM -0800, Paul E. McKenney wrote:
> > > > [snip]
> > > > > > We can certainly refine it further but at this time I am thinking of spending
> > > > > > my time reviewing Lai's patches and learning some other RCU things I need to
> > > > > > catch up on. If you hate this patch too much, we can also defer this patch
> > > > > > review for a bit and I can carry it in my tree for now as it is only a patch
> > > > > > to test code. But honestly, in its current form I am sort of happy with it.
> > > > > 
> > > > > OK, I will keep it as is for now and let's look again later on.  It is not
> > > > > in the bucket for the upcoming merge window in any case, so we do have
> > > > > quite a bit of time.
> > > > > 
> > > > > It is not that I hate it, but rather that I want to be able to give
> > > > > good answers to questions that might come up.  And given that I have
> > > > > occasionally given certain people a hard time about their statistics,
> > > > > it is only reasonable to expect them to return the favor.  I wouldn't
> > > > > want you to be caught in the crossfire.  ;-)
> > > > 
> > > > Since the weights were concerning, I was thinking of just using a weight of
> > > > (1 / N) where N is the number of samples. Essentially taking the average.
> > > > That could be simple enough and does not cause your concerns with weight
> > > > tuning. I tested it and looks good, I'll post it shortly.
> > > 
> > > YES!!!  ;-)
> > > 
> > > Snapshot mem_begin before entering the loop.  For the mean value to
> > > be solid, you need at least 20-30 samples, which might mean upping the
> > > default for kfree_loops.  Have an "unsigned long long" to accumulate the
> > > sum, which should avoid any possibility of overflow for current systems
> > > and for all systems for kfree_loops less than PAGE_SIZE.  At which point,
> > > forget the "%" stuff and just sum up the si_mem_available() on each pass
> > > through the loop.
> > > 
> > > Do the division on exit from the loop, preferably checking for divide
> > > by zero.
> > > 
> > > Straightforward, fast, reasonably reliable, and easy to defend.
> > 
> > I mostly did it along these lines. Hopefully the latest posting is reasonable
> > enough ;-) I sent it twice because I messed up the authorship (sorry).
> 
> No problem with the authorship-fix resend!
> 
> But let's get this patch consistent with basic statistics!
> 
> You really do need 20-30 samples for an average to mean much.
> 
> Of course, right now you default kfree_loops to 10.  You are doing
> 8000 kmalloc()/kfree_rcu() loops on each pass.  This is large enough
> that just dropping the "% 4" should be just fine from the viewpoint of
> si_mem_available() overhead.  But 8000 allocations and frees should get
> done in way less than one second, so kicking the default kfree_loops up
> to 30 should be a non-problem.
> 
> Then the patch would be both simpler and statistically valid.
> 
> So could you please stop using me as the middleman in your fight with
> the laws of mathematics and get this patch to a defensible state?  ;-)

The thing is, the signal doesn't vary much. I could very well just take one
of the 4 samples and report that. But I still took the average since
there are 4 samples. I don't see much point in taking more samples here since
I am not concerned that the signal will fluctuate much (and if it really
does, then I can easily catch that kind of variation with multiple rcuperf
runs).

But if you really want, I can increase the sampling to 20 samples or so and
resend it.
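
For reference, here is a rough userspace sketch of the sum-then-divide
approach you describe. It is not the patch itself: mem_footprint_avg() and
the mem_during[] array are hypothetical names made up for illustration, and
I am assuming the footprint is mem_begin minus the mean of the samples. In
rcuperf each sample would come from si_mem_available(), once per pass
through the kfree_loops loop.

```c
/*
 * Illustrative sketch only, not the rcuperf code.
 * mem_begin:  snapshot taken before entering the loop.
 * mem_during: one sample per pass (stand-in for si_mem_available()).
 */
static long long mem_footprint_avg(unsigned long mem_begin,
				   const unsigned long *mem_during,
				   unsigned int nr_samples)
{
	unsigned long long sum = 0;	/* wide accumulator to avoid overflow */
	unsigned int i;

	/* Guard against divide by zero when no samples were taken. */
	if (nr_samples == 0)
		return 0;

	/* Just sum every sample; no "% 4" subsampling. */
	for (i = 0; i < nr_samples; i++)
		sum += mem_during[i];

	/* Do the division once, on exit from the loop. */
	return (long long)mem_begin - (long long)(sum / nr_samples);
}
```

With this shape, going from 4 samples to 20 or 30 only changes how many
values get summed, not the logic.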

thanks,

 - Joel