Thanks for starting this thread. This would be very useful to have. I
had talked about this with a few people; here's what we had in mind:

- Instead of having a pass/fail against a baseline, track performance
  over time. We can have the suite run periodically, and trigger runs
  after a significant code change and at specific milestones. (A toy
  version of such a check is the last sketch below.)
- Store the performance values reported by the client; specifically,
  store the percentiles for a better understanding of how the
  performance changes (first sketch below).
- Store the performance values reported by Ceph itself (from the perf
  dump?); these might be less volatile than the ones reported by the
  client (third sketch below).
- Use a database that makes it easy to query those metrics. Jan had
  suggested InfluxDB, which is a time-series DB and would make such
  queries quite easy (second sketch below).
- Graph the performance across versions of Ceph (through commits) so
  we can find any regressions/improvements; with the tagging in the
  second sketch this becomes a single query.

Of course, for this to be relevant we'd need a setup and hardware that
don't change. Does that fit with what's suggested here?
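To make the client-side part concrete: the percentiles are already in
the JSON that fio emits (and that cbt leaves in the teuthology
archive), so pulling them out is cheap. A rough sketch in Python; the
exact key paths depend on the fio version (clat_ns vs. clat), so treat
them as illustrative:

import json

def fio_percentiles(path, op="read"):
    # Pull completion-latency percentiles (converted to ms) out of an
    # fio JSON result file, e.g. one from the teuthology archive.
    with open(path) as f:
        result = json.load(f)
    # fio 3.x reports completion latency in nanoseconds under clat_ns;
    # older versions use clat in microseconds.
    clat = result["jobs"][0][op]["clat_ns"]["percentile"]
    return {label: clat[key] / 1e6
            for label, key in [("p50", "50.000000"),
                               ("p95", "95.000000"),
                               ("p99", "99.000000")]}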
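Those values could then be pushed into InfluxDB, tagged by benchmark,
commit, and machine type. A minimal sketch using the influxdb Python
client; the measurement name and tag schema are just a strawman:

from influxdb import InfluxDBClient

def store_percentiles(client, benchmark, sha1, machine_type, percentiles):
    # One point per run; tags let us slice by benchmark, commit, and
    # machine type, while fields hold the percentile values in ms.
    client.write_points([{
        "measurement": "ceph_perf",
        "tags": {
            "benchmark": benchmark,        # e.g. "librbdfio"
            "sha1": sha1,                  # ceph commit under test
            "machine_type": machine_type,  # e.g. "smithi"
        },
        "fields": {k: float(v) for k, v in percentiles.items()},
    }])

client = InfluxDBClient(host="localhost", port=8086, database="ceph_perf")
store_percentiles(client, "librbdfio", "<sha1 under test>", "smithi",
                  {"p50": 1.2, "p95": 3.4, "p99": 7.8})

Graphing performance across commits then reduces to a query such as
SELECT mean("p99") FROM "ceph_perf" WHERE "benchmark" = 'librbdfio'
GROUP BY "sha1".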
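For the values reported by Ceph itself, something like this could
scrape a daemon's perf dump over the admin socket; which counters are
stable enough to be worth tracking is an open question, and op_latency
below is only an example:

import json
import subprocess

def perf_dump(daemon="osd.0"):
    # Ask a local daemon for its perf counters via the admin socket.
    out = subprocess.check_output(
        ["ceph", "daemon", daemon, "perf", "dump"])
    return json.loads(out)

counters = perf_dump("osd.0")
# e.g. average op latency as the OSD itself sees it; the counter is a
# running {"avgcount": N, "sum": seconds} pair in the dump.
op_latency = counters["osd"]["op_latency"]
avg_ms = 1000.0 * op_latency["sum"] / max(op_latency["avgcount"], 1)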
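And rather than a fixed pass/fail baseline, the analysis could flag a
run that is worse than a trailing window of previous runs by more than
a configurable threshold, along the lines of the thresholds Neha
mentions below. A toy version:

def is_regression(history, latest, threshold=0.10):
    # Compare the newest value (lower is better, e.g. p99 latency in
    # ms) against the mean of a trailing window of previous runs.
    if not history:
        return False  # nothing to compare against yet
    baseline = sum(history) / len(history)
    return latest > baseline * (1.0 + threshold)

assert not is_regression([3.1, 3.0, 3.2], 3.3)  # within 10% of the mean
assert is_regression([3.1, 3.0, 3.2], 3.6)      # ~16% above the mean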
Mohamad

On 04/04/2018 02:59 PM, Neha Ojha wrote:
> With the aim of doing automated performance testing using teuthology,
> we integrated a cbt task[1] into it. This task enables teuthology to
> run benchmarks like radosbench and librbdfio, by making use of the
> workloads and settings defined in the performance suite[2]. This suite
> runs as a part of the rados suite, and the test results are stored in
> the teuthology archives in JSON format.
>
> The final aim is to pass/fail tests based on performance results, but
> we have faced a few challenges in this process.
>
> Determining reasonable baseline values for tests is difficult:
>
> - Teuthology applies a different combination of configuration settings
> each time it runs these workloads, which creates a large sample space
> of configurations for us to track baselines for.
> - The hardware on which the performance tests are run in the lab
> varies.
> - Tests are hard to repeat under the same conditions.
>
> Storing performance results is also a challenge:
>
> - Currently, the test results are stored in the teuthology archives.
> We have figured out a way to store these results longer than usual (~2
> weeks), but in the long run this may not be an ideal location.
> - +1 to Greg's idea of some kind of database system to feed these
> results into and do better analysis.
>
> We had a discussion at Cephalocon regarding the above, and based on
> the ideas that came up, we have attempted to solve some of these
> issues.
>
> Last week, we merged a minimal performance suite[3], which runs 4
> basic jobs (a subset of the perf suite) outside of the rados suite.
> This suite now runs as a part of the nightly teuthology runs on a
> specific set of machines (smithi) in the sepia lab, on the ceph master
> branch.
> Our aim here is to reduce the sample space of tests, and the
> variability around these tests, so that we can come up with baselines
> for this smaller subset.
> We already have a simple result-analysis tool, which can be integrated
> with the cbt task to do the analysis and pass/fail tests based on
> configurable thresholds.
>
> We are also planning to expand the cbt task to cover rgw workloads.
>
> Something that would be very useful in the short term is a way to
> easily view/compare the data collected in these nightly runs.
>
> [1] https://github.com/ceph/ceph/pull/17583
> [2] https://github.com/ceph/ceph/tree/master/qa/suites/rados/perf
> [3] https://github.com/ceph/ceph/tree/master/qa/suites/perf-basic
>
> On Wed, Apr 4, 2018 at 12:55 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>> Performance testing is an area that teuthology does not currently
>> address. Neha is doing some work around integrating cbt (ceph
>> benchmark tool, from Mark and other performance-interested people)
>> into teuthology so we can run some performance jobs. But there’s a
>> lot more work to do if we want to make long-term use of these to
>> quantify our changes in performance, rather than micro-targeting
>> specific patches in the lab. We’re concerned about random noise,
>> machine variation, and reproducibility of results; and we have no way
>> to identify trends. In short, we need some kind of database system to
>> feed these results into and do analysis on. This would be a whole new
>> competency for the teuthology system, and we’re not sure how best to
>> go about it. But it’s becoming a major concern.
>>
>> PROBLEM TOPIC: how do we do performance testing and reliable analysis
>> in the lab?