Re: can't get cluster to become healthy. "stale+undersized+degraded+peered"

Hi,

looking at the outputs below, the following puzzles me:

You have two nodes but a replication size of 3 for your test-data pool.
With the default CRUSH map this won't work, because it tries to place
each replica on a different node.

So either change the pool to size 2, or add a third node ;-)
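If you go with the first option, something along these lines should do
it (pool name taken from your osd dump below; dropping min_size to 1 is
optional and only matters if you want I/O to continue with one node
down):

  ceph osd pool set test-data size 2
  ceph osd pool set test-data min_size 1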

best regards,
Kurt

Jogi Hofmüller wrote:
> Hi,
> 
> Some more info:
> 
> ceph osd tree
> ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 3.59998 root default
> -2 1.79999     host ceph1
>  0 0.89999         osd.0       up  1.00000          1.00000
>  1 0.89999         osd.1       up  1.00000          1.00000
> -3 1.79999     host ceph2
>  2 0.89999         osd.2       up  1.00000          1.00000
>  3 0.89999         osd.3       up  1.00000          1.00000
> 
> 
> With one pool that contains no objects:
> 
> ceph status
>     cluster 2d766dc4-0705-46f9-b559-664e49e0da5c
>      health HEALTH_WARN
>             128 pgs degraded
>             128 pgs stuck degraded
>             128 pgs stuck unclean
>             128 pgs stuck undersized
>             128 pgs undersized
>      monmap e1: 1 mons at {ceph1=172.16.16.17:6789/0}
>             election epoch 2, quorum 0 ceph1
>      osdmap e22: 4 osds: 4 up, 4 in
>       pgmap v45: 128 pgs, 1 pools, 0 bytes data, 0 objects
>             6768 kB used, 3682 GB / 3686 GB avail
>                  128 active+undersized+degraded
> 
> ceph osd dump
> epoch 22
> fsid 2d766dc4-0705-46f9-b559-664e49e0da5c
> created 2015-09-30 16:09:58.109963
> modified 2015-09-30 16:46:00.625417
> flags
> pool 1 'test-data' replicated size 3 min_size 2 crush_ruleset 0
> object_hash rjenkins pg_num 128 pgp_num 128 last_change 21 flags
> hashpspool stripe_width 0
> max_osd 4
> osd.0 up   in  weight 1 up_from 4 up_thru 21 down_at 0
> last_clean_interval [0,0) PUB.17:6800/11953 172.16.16.17:6800/11953
> 172.16.16.17:6801/11953 PUB.17:6801/11953 exists,up
> e384b160-d213-40a4-b3f1-a9146aaa41e1
> osd.1 up   in  weight 1 up_from 8 up_thru 21 down_at 0
> last_clean_interval [0,0) PUB.17:6802/12839 172.16.16.17:6802/12839
> 172.16.16.17:6803/12839 PUB.17:6803/12839 exists,up
> 4c14bda4-3c31-4188-976e-7f59fd717294
> osd.2 up   in  weight 1 up_from 12 up_thru 21 down_at 0
> last_clean_interval [0,0) PUB.18:6800/6583 172.16.16.18:6800/6583
> 172.16.16.18:6801/6583 PUB.18:6801/6583 exists,up
> 3dd88154-63b7-476d-b8c2-8a34483eb358
> osd.3 up   in  weight 1 up_from 17 up_thru 21 down_at 0
> last_clean_interval [0,0) PUB.18:6802/7453 172.16.16.18:6802/7453
> 172.16.16.18:6803/7453 PUB.18:6803/7453 exists,up
> 1a96aa8d-c13d-4536-b772-b4189e0069ff
> 
> After deleting the pool:
> 
> ceph status
>     cluster 2d766dc4-0705-46f9-b559-664e49e0da5c
>      health HEALTH_WARN
>             too few PGs per OSD (0 < min 30)
>      monmap e1: 1 mons at {ceph1=172.16.16.17:6789/0}
>             election epoch 2, quorum 0 ceph1
>      osdmap e23: 4 osds: 4 up, 4 in
>       pgmap v48: 0 pgs, 0 pools, 0 bytes data, 0 objects
>             6780 kB used, 3682 GB / 3686 GB avail
> ceph osd dump
> epoch 23
> fsid 2d766dc4-0705-46f9-b559-664e49e0da5c
> created 2015-09-30 16:09:58.109963
> modified 2015-09-30 16:56:24.678984
> flags
> max_osd 4
> osd.0 up   in  weight 1 up_from 4 up_thru 21 down_at 0
> last_clean_interval [0,0) PUB.17:6800/11953 172.16.16.17:6800/11953
> 172.16.16.17:6801/11953 PUB.17:6801/11953 exists,up
> e384b160-d213-40a4-b3f1-a9146aaa41e1
> osd.1 up   in  weight 1 up_from 8 up_thru 21 down_at 0
> last_clean_interval [0,0) PUB.17:6802/12839 172.16.16.17:6802/12839
> 172.16.16.17:6803/12839 PUB.17:6803/12839 exists,up
> 4c14bda4-3c31-4188-976e-7f59fd717294
> osd.2 up   in  weight 1 up_from 12 up_thru 21 down_at 0
> last_clean_interval [0,0) PUB.18:6800/6583 172.16.16.18:6800/6583
> 172.16.16.18:6801/6583 PUB.18:6801/6583 exists,up
> 3dd88154-63b7-476d-b8c2-8a34483eb358
> osd.3 up   in  weight 1 up_from 17 up_thru 21 down_at 0
> last_clean_interval [0,0) PUB.18:6802/7453 172.16.16.18:6802/7453
> 172.16.16.18:6803/7453 PUB.18:6803/7453 exists,up
> 1a96aa8d-c13d-4536-b772-b4189e0069ff
> 
> Regards,
> 


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
