Re: [ceph-users] First Reef release candidate - v18.1.0

Hi Michal!

On 6/27/23 18:02, Neha Ojha wrote:
Hi Michal,

Thank you for volunteering to help test the Reef release!

On Tue, Jun 27, 2023 at 6:44 AM Michal Strnad <michal.strnad@xxxxxxxxx> wrote:

    Hi everyone,

    We read that you are looking for Ceph users who would be willing to help with performance testing of the new version of Ceph called Reef. We would like to volunteer and offer our assistance :-).

    Currently, we are setting up a large cluster consisting of fifty storage nodes, each with 24 rotational disks and 8 NVMe drives, some of which are designated for BlueStore and others for data. Each of these machines is equipped with an AMD EPYC 7282 16-Core processor, ~314GB of memory, and a 2x25Gbps network connection. The network on each of these machines is used for both public and cluster communication, and if necessary we can prioritize one over the other through QoS adjustments within the VLAN. However, we haven't had the need to do so thus far.

    Furthermore, we have sixteen application servers for monitors, MGRs, metadata servers, and radosgw gateways. Each of these application servers is equipped with an AMD EPYC 7502 32-Core processor, ~250GB of memory, and a 2x25Gbps network connection.

    Both the storage and application servers are connected to two Nexus 9000 switches, with connectivity reaching several 100Gbps towards the internet.

    The mentioned cluster will be operational within a few weeks, with Ceph already installed and ready to undergo performance testing. Once this is ready, it will be possible to start testing the Reef version. We anticipate having approximately 2-3 weeks for testing. Are you interested in the performance results? To achieve better results, it would be beneficial to coordinate these tests in some way so that we don't repeat what others have already tried. Could you please guide us on what specific aspects we should focus on, which parameters to test, and how to properly conduct the tests?


We are particularly interested in seeing the performance impact of the new RocksDB version we'll be shipping with Reef. I am adding Mark to this email to provide guidance on performance tests.


As Neha mentioned, we did a fairly major RocksDB upgrade and also changed Ceph's default BlueStore RocksDB tunings after many years.  We *think* this is generally going to be an improvement, but there are some trade-offs.  We expect lower write latency, higher write IOPS, and reduced CPU consumption in the kv sync thread (a bigger deal on all-flash setups), but there's a chance that we might also see higher write amplification on the DB device in some scenarios.  In the testing I did, I saw lower write amp in some tests and higher write amp in others.  Paul Cuzner also ran tests over at IBM and saw higher CPU usage during reads in some tests, though I think those improved in the most recent version of the changes, and he also saw the write latency improvement.  His tests were primarily focused on low-to-medium load; I was initially doing max-load testing, but after seeing his results I also ran some low-load tests.
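
If you want to keep an eye on the write-amp question yourselves, the BlueFS perf counters are probably the easiest place to look.  Below is a minimal Python sketch of one way to do it: snapshot "ceph tell osd.N perf dump" before and after a benchmark window and report how many bytes BlueFS wrote to the WAL and SSTs.  The counter names under the "bluefs" section (bytes_written_wal, bytes_written_sst, bytes_written_slow) are what I'd expect recent releases to expose, but please verify them on your build; dividing the WAL+SST delta by the client bytes your benchmark wrote gives a rough write-amplification figure.

#!/usr/bin/env python3
# Rough sketch: snapshot BlueFS write counters on one OSD before and after a
# benchmark window and report the deltas.  The counter names under the
# "bluefs" section are assumptions -- check "ceph tell osd.N perf dump" on
# your build first.
import json
import subprocess
import sys
import time

COUNTERS = ("bytes_written_wal", "bytes_written_sst", "bytes_written_slow")

def bluefs_counters(osd_id):
    out = subprocess.check_output(
        ["ceph", "tell", f"osd.{osd_id}", "perf", "dump"])
    bluefs = json.loads(out).get("bluefs", {})
    return {name: bluefs.get(name, 0) for name in COUNTERS}

def main():
    osd_id = sys.argv[1] if len(sys.argv) > 1 else "0"
    window = int(sys.argv[2]) if len(sys.argv) > 2 else 300
    before = bluefs_counters(osd_id)
    time.sleep(window)  # run your benchmark during this window
    after = bluefs_counters(osd_id)
    for name in COUNTERS:
        delta = after[name] - before[name]
        print(f"{name}: {delta / 1e9:.2f} GB over {window}s")
    # Divide (WAL + SST) bytes by the client bytes written to this OSD over
    # the same window to approximate DB write amplification.

if __name__ == "__main__":
    main()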

I suspect that with HDD+flash setups none of this is going to matter all that much, since the rotational latency of the HDDs is likely to be the dominant effect.  It would be nice to verify that, though.  Given that this is a new cluster you are setting up, we won't have any historical data about existing RocksDB behavior to compare against.  We can still look at the general behavior of RocksDB, though, and see if there's anything obviously wrong with it.  I don't think we'll necessarily need cluster access unless something really interesting pops up, but perhaps Neha and Paul have something they want to look at.
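
As far as "general behavior" goes, the kv sync/flush/commit latencies the OSDs report are the first thing worth watching, since that's where the new tunings should show up.  Here's a minimal sketch along the same lines as above; the counter names (kv_flush_lat, kv_sync_lat, kv_commit_lat) and the avgcount/sum layout under the "bluestore" section are assumptions based on what recent OSDs report, so treat it as a starting point rather than a finished tool.

#!/usr/bin/env python3
# Minimal sketch: pull average BlueStore kv sync/flush/commit latencies from
# a set of OSDs so runs under different workloads can be compared.  The
# counter names and the {"avgcount", "sum"} layout are assumptions -- verify
# against "ceph tell osd.N perf dump" on your build.
import json
import subprocess
import sys

KV_COUNTERS = ("kv_flush_lat", "kv_sync_lat", "kv_commit_lat")

def kv_latencies(osd_id):
    out = subprocess.check_output(
        ["ceph", "tell", f"osd.{osd_id}", "perf", "dump"])
    bluestore = json.loads(out).get("bluestore", {})
    lats = {}
    for name in KV_COUNTERS:
        counter = bluestore.get(name, {})
        count = counter.get("avgcount", 0)
        total = counter.get("sum", 0.0)  # cumulative seconds
        lats[name] = total / count if count else 0.0
    return lats

if __name__ == "__main__":
    for osd_id in sys.argv[1:] or ["0"]:
        lats = kv_latencies(osd_id)
        pretty = ", ".join(f"{k}={v * 1000:.3f}ms" for k, v in lats.items())
        print(f"osd.{osd_id}: {pretty}")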

As far as workloads go, I'd say if you have any real workloads you intend to run (Veeam, scientific apps, etc.), those might actually be the most worthwhile.  We do a lot of synthetic testing internally; real workloads on real clusters are harder for us to do ourselves.  Heavy write workloads (especially smaller writes or smaller objects) would probably be the most valuable for this round, given the changes we're interested in looking at.
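
If you don't end up with a real application in time, even a simple sweep of small-object writes would exercise the paths that changed.  Something like the sketch below would give comparable numbers across object sizes; the pool name is just a placeholder, and runtime, threads, and sizes should be adjusted for your cluster.

#!/usr/bin/env python3
# Minimal sketch: sweep a few small object sizes with "rados bench" write
# tests.  The pool name, runtime, and thread count are placeholders -- point
# it at a dedicated throwaway pool.
import subprocess

POOL = "testpool"                    # placeholder benchmark pool
RUNTIME = 120                        # seconds per run
THREADS = 16                         # concurrent operations
OBJECT_SIZES = [4096, 16384, 65536]  # small writes stress the kv/WAL path

def bench_write(size):
    cmd = ["rados", "bench", "-p", POOL, str(RUNTIME), "write",
           "-b", str(size), "-t", str(THREADS), "--no-cleanup"]
    print(">>>", " ".join(cmd))
    subprocess.run(cmd, check=True)
    # --no-cleanup keeps object removal out of the write numbers; clean up
    # explicitly before the next run.
    subprocess.run(["rados", "-p", POOL, "cleanup"], check=True)

if __name__ == "__main__":
    for size in OBJECT_SIZES:
        bench_write(size)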


Mark




    After an agreement, it will be possible to arrange some form of access to the machines, for example by meeting via video conference and fine-tuning them together. Alternatively, we can also work on it through email, IRC, Slack, or any other suitable means.


We are coordinating community efforts around such testing in the #ceph-at-scale Slack channel in ceph-storage.slack.com. I sent you an invite.

Thanks,
Neha


    Kind regards,
    Michal Strnad


    On 6/13/23 22:27, Neha Ojha wrote:
    > Hi everyone,
    >
    > This is the first release candidate for Reef.
    >
    > The Reef release comes with a new RocksDB version (7.9.2) [0], which incorporates several performance improvements and features. Our internal testing doesn't show any side effects from the new version, but we are very eager to hear community feedback on it. This is the first release to have the ability to tune RocksDB settings per column family [1], which allows more granular tunings to be applied to the different kinds of data stored in RocksDB. A new set of settings has been used in Reef to optimize performance for most kinds of workloads, with a slight penalty in some cases that is outweighed by large improvements in use cases such as RGW, in terms of compactions and write amplification. We would highly encourage community members to give these a try against their performance benchmarks and use cases. The detailed list of changes in terms of RocksDB and BlueStore can be found in https://pad.ceph.com/p/reef-rc-relnotes.
    >
    > If any of our community members would like to help us with performance investigations or regression testing of the Reef release candidate, please feel free to provide feedback via email or in https://pad.ceph.com/p/reef_scale_testing. For more active discussions, please use the #ceph-at-scale Slack channel in ceph-storage.slack.com.
    >
    > Overall things are looking pretty good based on our testing. Please try it out and report any issues you encounter. Happy testing!
    >
    > Thanks,
    > Neha
    >
    > Get the release from
    >
    > * Git at git://github.com/ceph/ceph.git
    > * Tarball at https://download.ceph.com/tarballs/ceph-18.1.0.tar.gz
    > * Containers at https://quay.io/repository/ceph/ceph
    > * For packages, see https://docs.ceph.com/en/latest/install/get-packages/
    > * Release git sha1: c2214eb5df9fa034cc571d81a32a5414d60f0405
    >
    > [0] https://github.com/ceph/ceph/pull/49006
    > [1] https://github.com/ceph/ceph/pull/51821
    > _______________________________________________
    > ceph-users mailing list -- ceph-users@xxxxxxx
    > To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Best Regards,
Mark Nelson
Head of R&D (USA)

Clyso GmbH
p: +49 89 21552391 12
a: Loristraße 8 | 80335 München | Germany
w: https://clyso.com | e: mark.nelson@xxxxxxxxx

We are hiring: https://www.clyso.com/jobs/
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx



