On Wed, Jan 23, 2019 at 08:51:03AM +0800, Qu Wenruo wrote:
> 
> 
> On 2019/1/17 10:25 AM, Dave Chinner wrote:
> > On Thu, Jan 17, 2019 at 09:30:19AM +0800, Qu Wenruo wrote:
> >> On 2019/1/17 8:16 AM, Dave Chinner wrote:
> >>> On Wed, Jan 16, 2019 at 12:47:21PM +0800, Qu Wenruo wrote:
> >>>> E.g. one operation should finish in 30s, but when it takes over 300s,
> >>>> it's definitely a big regression.
> >>>>
> >>>> But considering how many different hardware/VM setups the test may be run on,
> >>>> I'm not really confident this is possible.
> >>>
> >>> You can really only determine performance regressions by comparing
> >>> test runtime on kernels with the same feature set run on the same
> >>> hardware. Hence you'll need to keep archives from all your test
> >>> machines and configs and only compare between matching
> >>> configurations.
> >>
> >> Thanks, this matches my current understanding of how the testsuite works.
> >>
> >> It looks like such regression detection can only be implemented outside
> >> of fstests.
> > 
> > That's pretty much by design. Analysis of multiple test run results
> > and post-processing them is really not something that the test
> > harness does. The test harness really just runs the tests and
> > records the results....
> 
> What about using some telemetry other than time to determine
> regression?
> 
> In my particular case, with the correct behavior, a reading like the
> generation would only increase by a somewhat predictable number.
> 
> When the regression happens, the generation goes way higher than
> expected.

That's something that would be done inside the test, right? i.e. this
has nothing to do with the test harness itself, but is a failure
criterion for the specific test?

> Is it acceptable to craft a test case using such a measurement?

If it's reliable and not prone to false positives from future code
changes, yes.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
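
[Editorial sketch] To illustrate the kind of in-test check being discussed, here is a minimal bash sketch, not an actual fstests test: it reads the btrfs superblock generation before and after the operation under test and fails if it grew by more than an expected bound. $DEV, run_workload, and MAX_GEN_DELTA are hypothetical placeholders; the threshold would need to be chosen conservatively to avoid the false positives Dave mentions.

    #!/bin/bash
    # Hypothetical device and workload; replace with the real test setup.
    DEV=/dev/sdb1
    MAX_GEN_DELTA=10     # assumed upper bound on expected generation growth

    get_gen() {
        # "generation" is a field printed by `btrfs inspect-internal dump-super`
        btrfs inspect-internal dump-super "$DEV" | awk '$1 == "generation" {print $2}'
    }

    run_workload() {
        # placeholder for the operation whose behavior is being checked
        :
    }

    gen_before=$(get_gen)
    run_workload
    sync
    gen_after=$(get_gen)

    delta=$((gen_after - gen_before))
    if [ "$delta" -gt "$MAX_GEN_DELTA" ]; then
        echo "generation increased by $delta, expected at most $MAX_GEN_DELTA"
        exit 1
    fi

The point of the check is that it keys on a deterministic filesystem counter rather than wall-clock time, so it does not depend on the hardware the test happens to run on.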