Re: [PATCH 0/4] Fix ebizzy performance regression due to X86 TLB range flush v2

Mel Gorman <mgorman@xxxxxxx> · Tue, 17 Dec 2013 17:54:41 +0000

On Tue, Dec 17, 2013 at 03:42:14PM +0100, Ingo Molnar wrote:
> 
> * Mel Gorman <mgorman@xxxxxxx> wrote:
> 
> > [...]
> >
> > At that point it'll be time to look at profiles and see where we are 
> > actually spending time because the possibilities of finding things 
> > to fix through bisection will be exhausted.
> 
> Yeah.
> 
> One (heavy handed but effective) trick that can be used in such a 
> situation is to just revert everything that is causing problems, and 
> continue reverting until we get back to a v3.4 baseline performance.
> 

Very tempted but the potential timeframe here is very large and the number
of patches could be considerable. Some patches cause a lot of noise. For
example, one patch enabled ACPI cpufreq driver loading which looks like
a regression during that window but it's a side-effect that gets fixed
later. It'll take time to identify all the patches that potentially cause
problems.

> Once such a 'clean' tree (or queue of patches) is achieved, that can be 
> used as a measurement base and the individual features can be 
> re-applied again, one by one, with measurement and analysis becoming a 
> lot easier.
> 

Ordinarily I would agree with you but would prefer a shorter window for
that type of strategy.

> > > Also it appears the Ebizzy numbers ought to be stable enough now 
> > > to make the range-TLB-flush measurements more precise?
> > 
> > Right now, the tlbflush microbenchmark figures look awful on the 
> > 8-core machine when the tlbflush shift patch and the schedule domain 
> > fix are both applied.
> 
> I think that further strengthens the case for the 'clean base' approach 
> I outlined above - but it's your call obviously ...
> 

I'll keep it as plan B if it cannot be fixed with a direct approach.

> Thanks again for going through all this. Tracking multi-commit 
> performance regressions across 1.5 years worth of commits is generally 
> very hard. Does your testing effort come from enterprise Linux QA 
> testing, or did you run into this problem accidentally?
> 

It does not come from enterprise Linux QA testing but it's motivated by
it. I want to catch as many "obvious" performance bugs before they do as
it saves time and stress in the long run. To assist that, I set up continual
performance regression testing and ebizzy was included in the first report
I opened.  It makes me worry about what the rest of the reports contain.

-- 
Mel Gorman
SUSE Labs
