Well, what's "best" really depends on your needs and use case. The general advice, which has been floated several times now, is to have at least N+2 entities of your failure domain in your cluster.
So for example, if you run with size=3 then you should have at least 5 OSDs if your failure domain is OSD, and 5 hosts if your failure domain is host.
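If you want to confirm what the pools and the rule are currently set to, something along these lines should show it (Jewel-era commands, using the pool and rule names from your output below; adjust to your cluster):

ceph osd pool get cephfs_data size
ceph osd pool get cephfs_metadata size
ceph osd crush rule dump replicated_ruleset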
On Sat, Jun 3, 2017 at 3:20 AM, Oleg Obleukhov <leoleovich@xxxxxxxxx> wrote:

But what would be the best? Have 3 servers, and how many OSDs? Thanks!

On 2 Jun 2017, at 17:09, David Turner <drakonstein@xxxxxxxxx> wrote:

That's good for testing in the small scale. For production I would revisit using size 3. Glad you got it working.

On Fri, Jun 2, 2017 at 11:02 AM Oleg Obleukhov <leoleovich@xxxxxxxxx> wrote:

Thanks to everyone, the problem is solved by:

ceph osd pool set cephfs_metadata size 2
ceph osd pool set cephfs_data size 2

Best, Oleg.

On 2 Jun 2017, at 16:15, Oleg Obleukhov <leoleovich@xxxxxxxxx> wrote:

Hello,

I am playing around with Ceph (ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)) on Debian Jessie and I built a test setup:

$ ceph osd tree
ID WEIGHT  TYPE NAME                   UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.01497 root default
-2 0.00499     host af-staging-ceph01
 0 0.00499         osd.0                    up  1.00000          1.00000
-3 0.00499     host af-staging-ceph02
 1 0.00499         osd.1                    up  1.00000          1.00000
-4 0.00499     host af-staging-ceph03
 2 0.00499         osd.2                    up  1.00000          1.00000

So I have 3 OSDs on 3 servers. I also created 2 pools:

ceph osd dump | grep 'replicated size'
pool 1 'cephfs_data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 33 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 31 flags hashpspool stripe_width 0

Now I am testing failover and kill one of the servers:

ceph osd tree
ID WEIGHT  TYPE NAME                   UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.01497 root default
-2 0.00499     host af-staging-ceph01
 0 0.00499         osd.0                    up  1.00000          1.00000
-3 0.00499     host af-staging-ceph02
 1 0.00499         osd.1                  down  1.00000          1.00000
-4 0.00499     host af-staging-ceph03
 2 0.00499         osd.2                    up  1.00000          1.00000

And now it is stuck in the recovery state:

ceph -s
    cluster 6b5ff07a-7232-4840-b486-6b7906248de7
     health HEALTH_WARN
            64 pgs degraded
            18 pgs stuck unclean
            64 pgs undersized
            recovery 21/63 objects degraded (33.333%)
            1/3 in osds are down
            1 mons down, quorum 0,2 af-staging-ceph01,af-staging-ceph03
     monmap e1: 3 mons at {af-staging-ceph01=10.36.0.121:6789/0,af-staging-ceph02=10.36.0.122:6789/0,af-staging-ceph03=10.36.0.123:6789/0}
            election epoch 38, quorum 0,2 af-staging-ceph01,af-staging-ceph03
      fsmap e29: 1/1/1 up {0=af-staging-ceph03.crm.ig.local=up:active}, 2 up:standby
     osdmap e78: 3 osds: 2 up, 3 in; 64 remapped pgs
            flags sortbitwise,require_jewel_osds
      pgmap v334: 64 pgs, 2 pools, 47129 bytes data, 21 objects
            122 MB used, 15204 MB / 15326 MB avail
            21/63 objects degraded (33.333%)
                  64 active+undersized+degraded

And if I kill one more node I lose access to the mounted file system on the client. Normally I would expect the replica factor to be respected and Ceph to create the missing copies of the degraded PGs.

I was trying to rebuild the crush map, and it looks like this, but this did not help:

rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type osd
        step emit
}
# end crush map

Would very much appreciate help,
Thank you very much in advance,
Oleg.
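As an aside on the rule quoted above: with "step chooseleaf firstn 0 type osd" the failure domain is the individual OSD, so with only 3 OSDs and size 3 there is nowhere left to place the third copy once one OSD goes down, which is why the PGs stay undersized. If the goal is to survive a host failure, the chooseleaf step would normally select type host instead. A rough sketch of the usual decompile/edit/recompile cycle (file names here are arbitrary):

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt, e.g. change the rule to: step chooseleaf firstn 0 type host
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new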
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com