Re: right pg_num value for CephFS Quick Start guide

Hi,

If you have the proper setup, you should always reach active+clean for all your PGs:

- Single node with 2 OSDs: rule replicates across OSD; set size=2 and min_size=1 on your pool (see the example commands below)
- Single node with 3 OSDs: rule replicates across OSD (default will be size=3, min_size=2 on your pool)
- Multiple nodes, 2 nodes: rule replicates across HOST; set size=2 and min_size=1 on your pool
- Multiple nodes, 3 nodes: rule replicates across HOST (default will be size=3, min_size=2 on your pool)
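
For the 2-OSD and 2-node cases above, a minimal sketch of the per-pool commands, assuming the CephFS pools are named cephfs_data and cephfs_metadata (substitute your own pool names):

    ceph osd pool set cephfs_data size 2
    ceph osd pool set cephfs_data min_size 1
    ceph osd pool set cephfs_metadata size 2
    ceph osd pool set cephfs_metadata min_size 1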

Tip: For single node deployment set osd_crush_chooseleaf_type = 0 in your configuration file [global] section before you deploy your MONs and OSDs and it will create the correct CRUSH rule
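
For example, the relevant ceph.conf fragment would look like this (only this option is shown; the rest of your [global] section stays as-is):

    [global]
    osd_crush_chooseleaf_type = 0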

Regards
JC

On Sep 13, 2019, at 09:57, Rishabh Dave <ridave@xxxxxxxxxx> wrote:

On Wed, 11 Sep 2019 at 23:31, Sage Weil <sage@xxxxxxxxxxxx> wrote:

On Wed, 11 Sep 2019, Rishabh Dave wrote:
Hello,

While working on CephFS Quick Start guide[1], the major issue that I
came across was choosing the value for pg_num for the pools that will
serve CephFS. I've tried the values from 4 to 128 for both data and
metadata pools and have always got "undersized+peered" instead of
"active+clean". Copying pg_num values from the cluster setup by
vstart.sh (8 for data and 16 for metadata pools) gave me the same
result.

About the cluster: I had a single node running Fedora 29 with 1 MON, 1
MGR, 1 MDS and 3 OSDs each with a disk size of 10 GB. Thinking that

This is unrelated to the PGs or the capacity--the problem is that you have
a single node, and the default CRUSH rule replicates across hosts.
That's why your pools are unhealthy.

You can fix this by creating a new crush rule with 'osd' instead of
'host' as the failure domain, and then setting your pool(s) to use that
rule.

osd crush rule create-replicated <name> <root> <type> {<class>}
    create crush rule <name> for replicated pool to start from <root>,
    replicate across buckets of type <type>, use devices of type <class>
    (ssd or hdd)

osd pool set <poolname> crush_rule <rule-name>
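
Concretely, for a single-node cluster this could look like the following sketch (the rule name replicated_osd and the pool names cephfs_data/cephfs_metadata are just examples, not anything created by default):

    ceph osd crush rule create-replicated replicated_osd default osd
    ceph osd pool set cephfs_data crush_rule replicated_osd
    ceph osd pool set cephfs_metadata crush_rule replicated_osd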

sage


Both setting the new CRUSH rule on the pools and creating OSDs on
separate nodes worked. Thanks!

However, in both cases the PG status wasn't "active+clean"; it was "54
active+undersized" and "10 active+undersized+degraded". IMO, it would be
ideal for the CephFS Quick Start guide to lead to "active+clean". Is there
anything more that can be done that would be suitable for the guide?
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx
