Re: Need feedback for Ceph User Survey 2019

Lars Marowsky-Bree <lmb@xxxxxxxx> · Wed, 2 Oct 2019 18:15:53 +0200

On 2019-10-01T16:08:43, Mike Perez <miperez@xxxxxxxxxx> wrote:

Hi all,

yay, survey time!

> We conduct yearly user surveys to better understand how our users utilize Ceph.
> The Ceph Foundation collects the data under the Community Data License
> Agreement [0], which helps the community make better-informed decisions
> about where our efforts in the development of future releases should go.

I also like that we're collecting this under the CDLA 1.0 Sharing variant
(which means we need to avoid any e-mail addresses and org names though, I
think; folks probably don't want those globally shared).

> A second question that came up was how to lay out questions for multiple
> cluster deployments. An idea I had was having our general Ceph user survey
> [2] separate from the deployment questions [3]. The general questions only
> need to be answered once, and the deployment survey can be answered
> multiple times to capture the different configurations. I'm looking into a
> way to link the answers of both surveys together.

I think, perhaps, we can just ask for the aggregates across all
clusters. If we make it too detailed, it'll be too complex for
respondents to enter and we'll not hear from them.

Alternatively, we could perhaps have a table for the per-cluster
questions, each row representing one cluster or even pool.

(e.g., if they have a 10 PiB cluster hosting both 1 PiB of replicated
metadata and hot data and 6 PiB of 8+4 S3 data, what is the
answer?)

I guess my point is - unless we're going to that level of detail, we're
already aggregating (and losing details) at the per-cluster level and
per-org aggregation isn't too bad. I'd rather lower the barrier to
respond -> "check all that apply" (would make most of our questions
multiple choice).
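
To illustrate what I mean by per-org aggregation, here is a rough sketch
in Python. It assumes a hypothetical per-pool table (the PoolRow fields
and the aggregate() helper are made up for illustration, not part of any
existing survey tooling) and collapses it into the kind of "check all
that apply" answer a respondent would give:

```python
from dataclasses import dataclass


@dataclass
class PoolRow:
    """One row of a hypothetical per-cluster/per-pool survey table."""
    cluster: str          # respondent-chosen label, not an org name
    protection: str       # e.g. "replicated" or "ec-8+4"
    capacity_pib: float   # capacity attributed to this pool, in PiB


def aggregate(rows):
    """Collapse per-pool rows into one org-level answer."""
    return {
        "clusters": len({r.cluster for r in rows}),
        "total_capacity_pib": sum(r.capacity_pib for r in rows),
        # "check all that apply": the set of schemes in use anywhere
        "protection_schemes": sorted({r.protection for r in rows}),
    }


rows = [
    PoolRow("a", "replicated", 1.0),
    PoolRow("a", "ec-8+4", 6.0),
]
summary = aggregate(rows)
print(summary)
```

The aggregate loses which pool uses which scheme, but that is exactly
the detail we'd already be losing at the per-cluster level.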

We should instead focus on getting that level of detail from Telemetry
in 2020.

An intermediate solution could be to ask them to "run this command on
each of your clusters and paste the output here". But if we ask them to
spend 5-10 minutes answering questions per cluster ...

(Alas, we can't ask them to just turn on Telemetry, since it's not
backported and feature-complete for all relevant releases. Perhaps we
could build a standalone Telemetry client for pre-Nautilus releases? But
not for this cycle.)

For the fields where we do ask numbers, instead of endless drop-down
lists, I'd rather ask for a, well, number. Why give them a drop-down
list for total raw capacity? Why not just ask for the number of
clusters? How many nodes? Etc, even for replication size/EC profiles.

And we should filter redundant questions - if we ask, say, for both the
total number of nodes, and the total number of OSDs, we don't have to
ask "how many OSDs per node". Unless we consider this a consistency
check.
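
If we did want to keep it as a consistency check, a minimal sketch could
look like the following. The function name and the 25% tolerance are my
own assumptions, purely for illustration; it compares the reported "OSDs
per node" against the value derived from the two totals:

```python
def osds_per_node_consistent(total_nodes, total_osds, reported_per_node,
                             tolerance=0.25):
    """Return True if the reported OSDs-per-node figure roughly matches
    total_osds / total_nodes (within a relative tolerance)."""
    if total_nodes <= 0:
        # Cannot derive a ratio; treat the response as inconsistent.
        return False
    derived = total_osds / total_nodes
    if derived == 0:
        return reported_per_node == 0
    return abs(reported_per_node - derived) <= tolerance * derived


# 120 OSDs on 10 nodes gives 12 per node; a reported 12 matches,
# a reported 30 does not.
print(osds_per_node_consistent(10, 120, 12))
print(osds_per_node_consistent(10, 120, 30))
```

Responses that fail such a check could be flagged for a closer look
rather than discarded.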

The survey pad has a lot of feedback, some of it contradictory, and not
all questions asked consistently. So it's not perfectly clear to me what
the current consolidated draft would look like.

Perhaps if we consolidate the draft prior to posting to ceph-users,
that'd be helpful.

Regards,
    Lars

-- 
SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
"Architects should open possibilities and not determine everything." (Ueli Zbinden)
_______________________________________________
Dev mailing list -- dev@xxxxxxx