Re: Recovery stuck in active+undersized+degraded

Well, what's "best" really depends on your needs and use case. The general advice, which has been floated several times now, is to have at least N+2 entities of your failure domain in your cluster, where N is the pool's replica count (size).
So, for example, if you run with size=3, you should have at least 5 OSDs if your failure domain is OSD, or 5 hosts if your failure domain is host.
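
The N+2 rule of thumb can be sketched as a quick check. This is only an illustration, not a Ceph API: the helper names `min_failure_domains` and `can_recover_fully` are hypothetical, and the sketch assumes each replica must land in a distinct failure domain (OSD or host, depending on the CRUSH rule).

```python
def min_failure_domains(size, spare=2):
    """Recommended number of failure domains (OSDs or hosts) for a pool.

    Hypothetical helper encoding the "N+2" rule of thumb from this thread.
    """
    return size + spare


def can_recover_fully(size, domains_total, domains_down):
    """True if enough failure domains remain up to hold `size` replicas.

    Assumption: every replica must be placed in a distinct failure domain.
    """
    return domains_total - domains_down >= size


# The test cluster from this thread: 3 failure domains, one of them down.
print(min_failure_domains(3))        # 5
print(can_recover_fully(3, 3, 1))    # False: only 2 domains left for 3 replicas
print(can_recover_fully(2, 3, 1))    # True: size=2 fits on the 2 survivors
print(can_recover_fully(3, 5, 2))    # True: N+2 leaves room after two failures
```

Under these assumptions, a size=3 pool on three failure domains has nowhere to rebuild the third replica once one domain is down, which matches the PGs staying active+undersized+degraded until size was lowered to 2.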


On Sat, Jun 3, 2017 at 3:20 AM, Oleg Obleukhov <leoleovich@xxxxxxxxx> wrote:
But what would be best? With 3 servers, how many OSDs should I have?
Thanks!

On 2 Jun 2017, at 17:09, David Turner <drakonstein@xxxxxxxxx> wrote:

That's good for testing at a small scale. For production I would revisit using size 3. Glad you got it working.

On Fri, Jun 2, 2017 at 11:02 AM Oleg Obleukhov <leoleovich@xxxxxxxxx> wrote:
Thanks, everyone.
The problem was solved by:
ceph osd pool set cephfs_metadata size 2
ceph osd pool set cephfs_data size 2

Best, Oleg.
On 2 Jun 2017, at 16:15, Oleg Obleukhov <leoleovich@xxxxxxxxx> wrote:

Hello,
I am playing around with Ceph (ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)) on Debian Jessie and I built a test setup:

$ ceph osd tree
ID WEIGHT  TYPE NAME                  UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.01497 root default
-2 0.00499     host af-staging-ceph01
 0 0.00499         osd.0                   up  1.00000          1.00000
-3 0.00499     host af-staging-ceph02
 1 0.00499         osd.1                   up  1.00000          1.00000
-4 0.00499     host af-staging-ceph03
 2 0.00499         osd.2                   up  1.00000          1.00000

So I have 3 OSDs on 3 servers.
I also created 2 pools:

ceph osd dump | grep 'replicated size'
pool 1 'cephfs_data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 33 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 31 flags hashpspool stripe_width 0

Now I am testing failover by killing one of the servers:
ceph osd tree
ID WEIGHT  TYPE NAME                  UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.01497 root default
-2 0.00499     host af-staging-ceph01
 0 0.00499         osd.0                   up  1.00000          1.00000
-3 0.00499     host af-staging-ceph02
 1 0.00499         osd.1                 down  1.00000          1.00000
-4 0.00499     host af-staging-ceph03
 2 0.00499         osd.2                   up  1.00000          1.00000

And now it is stuck in the recovery state:
ceph -s
    cluster 6b5ff07a-7232-4840-b486-6b7906248de7
     health HEALTH_WARN
            64 pgs degraded
            18 pgs stuck unclean
            64 pgs undersized
            recovery 21/63 objects degraded (33.333%)
            1/3 in osds are down
            1 mons down, quorum 0,2 af-staging-ceph01,af-staging-ceph03
            election epoch 38, quorum 0,2 af-staging-ceph01,af-staging-ceph03
      fsmap e29: 1/1/1 up {0=af-staging-ceph03.crm.ig.local=up:active}, 2 up:standby
     osdmap e78: 3 osds: 2 up, 3 in; 64 remapped pgs
            flags sortbitwise,require_jewel_osds
      pgmap v334: 64 pgs, 2 pools, 47129 bytes data, 21 objects
            122 MB used, 15204 MB / 15326 MB avail
            21/63 objects degraded (33.333%)
                  64 active+undersized+degraded
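
The `21/63 objects degraded (33.333%)` figure can be reproduced from the numbers above. A small sketch of the arithmetic, assuming each object is counted once per replica and that the down OSD held exactly one replica of every object (which holds here with size=3 across only three OSDs); `degraded_ratio` is a hypothetical helper name, not part of Ceph:

```python
def degraded_ratio(objects, size, replicas_missing):
    """Return (missing copies, total copies, percent degraded).

    Assumption: every object is missing the same number of replicas.
    """
    total_copies = objects * size
    missing = objects * replicas_missing
    return missing, total_copies, 100.0 * missing / total_copies


# 21 objects, pools with size=3, one replica of each object on the down OSD.
missing, total, pct = degraded_ratio(objects=21, size=3, replicas_missing=1)
print(f"{missing}/{total} objects degraded ({pct:.3f}%)")
# prints: 21/63 objects degraded (33.333%)
```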

And if I kill one more node, I lose access to the mounted file system on the client.
Normally I would expect the replication factor to be respected, and Ceph should create the missing copies of the degraded PGs.

I tried rebuilding the CRUSH map so that the rule looks like this, but it did not help:
rule replicated_ruleset {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type osd
    step emit
}

# end crush map

I would very much appreciate any help.
Thank you very much in advance,
Oleg.




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

