Dear Ceph users,

I'm writing to inform the community about a new performance channel that will be added to the telemetry module in the upcoming Quincy release. Like all other channels, this channel is strictly opt-in, but we’d like to know if there are any concerns regarding this new collection and whether users would feel comfortable sharing performance-related data with us. Please review the details below and respond to this email with any thoughts or questions you might have.

We’ll also discuss this topic live at our next "Ceph User + Dev Monthly Meetup", which will be held this week on December 16th. Feel free to join the meeting and give direct feedback to developers.

U+D meeting details: https://calendar.google.com/calendar/u/0/embed?src=9ts9c7lt7u1vic2ijvvqqlfpo0@xxxxxxxxxxxxxxxxxxxxxxxxx

-------------------------------------------------------------------------------------------------------------------------------

The telemetry module has been around since Luminous v12.2.13. Operating on a strict opt-in basis, the telemetry module sends anonymous data about Ceph users’ clusters securely back to the Ceph developers. This data, which is displayed on public dashboards [1], helps developers understand how Ceph is used and what problems users may be experiencing.

The telemetry module is divided into several channels, each of which collects a different set of information. Existing channels include "basic", "crash", "device", and "ident". All existing channels, as well as future channels, let users choose whether to opt in. See the latest documentation for more details: https://docs.ceph.com/en/latest/mgr/telemetry/

For the upcoming Quincy release, we have designed a new performance channel ("perf") that collects various performance metrics across a cluster. As developers, we would like to use data from the perf channel to:

1. gain a better understanding of how clusters are used
2. discover changes in cluster usage over time
3. identify performance-related bottlenecks
4. model benchmarks used in upstream testing on workload patterns seen in the field
5. suggest better Ceph configuration values based on use case

In addition, and most importantly, we have designed the perf channel with users in mind. Our goal is to give users better access to detailed performance information about their clusters, all in one place. With this performance data, we aim to give users the ability to:

1. monitor their own cluster’s performance by daemon type
2. access detailed information about their cluster's overall health
3. identify patterns in their cluster’s workload
4. troubleshoot performance issues in their cluster, e.g. issues with latency, throttling, or memory management

In the process of designing the perf channel, we also saw a need for users to be able to view the data they are sending when telemetry is on, as well as the data that would be sent when telemetry is off. With this new design, a user can see which collections they are reporting when telemetry is on with the command `ceph telemetry show`. If telemetry is off, meaning the user has not opted in to sending data, they can preview a sample report with `ceph telemetry preview`. The same flow can be followed per channel, if preferred: `ceph telemetry show <channel_name>` or `ceph telemetry preview <channel_name>`. In the case of the perf channel, a user who is opted into telemetry (telemetry is on) may view a report of the perf collection with `ceph telemetry show perf`, while a user who is not opted in (telemetry is off) may view a preview of the perf collection with `ceph telemetry preview perf`.
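To make that flow concrete, here is a minimal command-line sketch. The show/preview commands are the ones described above; exact output formatting may still change before Quincy is released:

    # Telemetry off (not opted in): preview what would be sent; nothing leaves the cluster
    ceph telemetry preview            # sample report across all channels
    ceph telemetry preview perf       # sample of the perf channel only

    # Telemetry on (opted in): inspect what is actually being reported
    ceph telemetry show               # full report
    ceph telemetry show perf          # perf channel only

The preview variants only render the report locally, so the data can be reviewed before deciding whether to opt in.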
Metrics in the perf channel are reported on an individual daemon/pool/pg basis. As such, the length of the perf report depends on how many daemons, pools, and pgs a cluster has. We chose this route instead of aggregating the metrics, since aggregation abstracts the data and makes it difficult to identify problems in individual daemons, pools, and pgs.

The metrics that the perf channel collects can be summarized by these categories:

1. perf_counters: All performance counters available to the manager module that have a "USEFUL" priority or higher. Counters are grouped by daemon type (e.g. mds, mon, osd, rgw). These metrics look very similar to the output you would get from the `ceph tell {daemon_type}.* perf dump` command.
2. io_rate: The current change in IOPS on the Ceph cluster, fetched from the manager module. This includes deltas in pg log size, store stats, and read/write operations. We use these same metrics (num_read, num_read_kb, num_write, num_write_kb) to generate the output of the iostat module.
3. osd_perf_histograms: 2D histograms that measure the relationship between latency and request size for certain read/write operations on OSDs. The histograms collected in the perf channel are derived from the `ceph tell osd.* perf histogram dump` command.
4. stats_per_pg: A dump of IOPS, log size, and scrubbing metrics on a per-pg basis. We fetch this information from the manager module, but the output can also be found in `ceph pg dump`.
5. stats_per_pool: A dump of IOPS, pg log size, and store stats on a per-pool basis. We fetch this information from the manager module, but the output can also be found in `ceph pg dump`. We refrain from collecting pool names here to avoid any sensitive information.
6. mempool: Memory allocations grouped by container on a per-osd basis. The mempool metrics collected in the perf channel are derived from the `ceph tell osd.* dump_mempools` command.

We are still in the process of adding these metrics:

1. rocksdb_stats: A dump of metrics used to analyze performance of the RocksDB key-value store, such as compaction time. These metrics will be derived from a new admin socket command that is undergoing review.
2. tcmalloc_heap_stats: A dump of tcmalloc heap profiles on a per-osd basis. These metrics would be derived from the `ceph tell osd.* heap dump` command.
3. osd_dump_stats: Here, we are mainly interested in collecting the pool applications (e.g. "rbd" or "mgr") so we have a better sense of what each pool is used for. If we collect these metrics, we would screen out any sensitive information such as pool names.

Linked at the bottom of this email are sample reports (with the perf channel enabled) taken from our Long Running Cluster (LRC) [2][3]. The LRC’s services include 5 monitors, 3 managers, 3 mds daemons, and 89 osds; the data spans 19 pools and 2,833 pgs. In reviewing these reports, you can see the exact metrics we are collecting in the perf channel and the exact structure in which those metrics will appear in the telemetry report.

At this point, we’d like to know if there are any concerns regarding the data we plan to include in the performance report and whether users are comfortable sharing it with us.
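If it helps with evaluating the report, the raw sources named above can also be queried directly on any cluster, for example as below (osd.0 is used here only as a sample daemon id; local output is unfiltered and considerably more verbose than what telemetry would send):

    ceph tell osd.0 perf dump              # raw perf counters (the perf channel keeps only "USEFUL"+ priority)
    ceph tell osd.0 perf histogram dump    # 2D latency vs. request-size histograms
    ceph tell osd.0 dump_mempools          # memory allocations grouped by container
    ceph pg dump                           # per-pg and per-pool stats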
Thanks,
Laura Flores

[1] Telemetry Public Dashboards -- https://telemetry-public.ceph.com
[2] Sample Telemetry Full Report -- https://gist.github.com/ljflores/720d32e6d5b8a6f8f42d9eec0428d8da
[3] Sample Telemetry Perf Report -- https://gist.github.com/ljflores/78a5764dc97d73dd63b341929976ae55

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx