I saw this statement at the following link ( http://docs.ceph.com/docs/master/rados/operations/crush-map/ ) -- is that the reason for the warning?

"This, combined with the default CRUSH failure domain, ensures that replicas or erasure code shards are separated across hosts and a single host failure will not affect availability."

Best Regards,
Dave Chen

-----Original Message-----
From: Chen2, Dave
Sent: Friday, June 22, 2018 1:59 PM
To: 'Burkhard Linke'; ceph-users@xxxxxxxxxxxxxx
Cc: Chen2, Dave
Subject: RE: PG status is "active+undersized+degraded"

Hi Burkhard,

Thanks for your explanation. I created a new 2 TB OSD on another node, and that indeed solved the issue; the cluster status is now "health HEALTH_OK".

Another question: if three homogeneous OSDs are spread across only 2 nodes, I still get the warning message and the status is "active+undersized+degraded". Is spreading the three OSDs across 3 nodes a mandatory rule for Ceph? Is it only an HA consideration? Is there any official Ceph documentation with guidance on this?

$ ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 7.25439 root default
-2 1.81360     host ceph3
 2 1.81360         osd.2       up  1.00000          1.00000
-4 3.62720     host ceph1
 0 1.81360         osd.0       up  1.00000          1.00000
 1 1.81360         osd.1       up  1.00000          1.00000

Best Regards,
Dave Chen

-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Burkhard Linke
Sent: Thursday, June 21, 2018 2:39 PM
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re: PG status is "active+undersized+degraded"

Hi,

On 06/21/2018 05:14 AM, Dave.Chen@xxxxxxxx wrote:
> Hi all,
>
> I have set up a Ceph cluster in my lab recently. The configuration should be okay per my understanding (4 OSDs across 3 nodes, 3 replicas), but a couple of PGs are stuck in the state "active+undersized+degraded". I think this should be a fairly common issue; could anyone help me out?
>
> Here are the details of the cluster:
>
> $ ceph -v (jewel)
> ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
>
> # ceph osd tree
> ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 5.89049 root default
> -2 1.81360     host ceph3
>  2 1.81360         osd.2       up  1.00000          1.00000
> -3 0.44969     host ceph4
>  3 0.44969         osd.3       up  1.00000          1.00000
> -4 3.62720     host ceph1
>  0 1.81360         osd.0       up  1.00000          1.00000
>  1 1.81360         osd.1       up  1.00000          1.00000

*snipsnap*

You have a large difference in the capacities of the nodes. This results in different host weights, which in turn can cause problems for the CRUSH algorithm: for some of the PGs it is not able to find three different hosts for OSD placement. Ceph and CRUSH do not cope well with heterogeneous setups.

I would suggest moving one of the OSDs from host ceph1 to ceph4 to equalize the host weights.

Regards,
Burkhard
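
To see exactly which PGs are affected and why, the stuck PGs and their acting sets can be inspected directly. A minimal sketch using standard Jewel-era commands; the pool name "rbd" (the default pool on a Jewel install) and the PG id "0.4" are placeholders to substitute with values from the actual output:

$ ceph health detail               # lists the undersized/degraded PGs
$ ceph pg dump_stuck undersized    # stuck PGs with their up/acting OSD sets
$ ceph pg 0.4 query                # for one such PG: "up" and "acting" hold only two OSDs
$ ceph osd pool get rbd size       # confirms the pool wants three replicas
$ ceph osd crush rule dump         # shows the rule step "chooseleaf ... type host"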
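
The documentation passage quoted at the top of the thread is indeed the reason for the warning: the default replicated CRUSH rule picks each replica from a distinct host, so a pool with size 3 can never be fully replicated on a two-host cluster, and its PGs stay "active+undersized+degraded". Three hosts are not mandatory in general, only under the default rule with size 3. For a lab, two workarounds are commonly used; this is a hedged sketch in which the pool name "rbd" and the ruleset id 1 are assumptions (check the real id with "ceph osd crush rule dump" after creating the rule):

$ ceph osd pool set rbd size 2     # option 1: accept two replicas on two hosts

# option 2: relax the failure domain to "osd" so replicas may share a host;
# this gives up host-level redundancy and is only sensible for testing
$ ceph osd crush rule create-simple replicated_osd default osd
$ ceph osd pool set rbd crush_ruleset 1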
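
Burkhard's suggestion amounts to physically relocating one disk and recording its new CRUSH location. A minimal sketch, assuming osd.1 is the device moved from ceph1 to ceph4 and keeps its weight of 1.81360; note that with the default "osd crush update on start = true", the OSD would also re-register its location by itself when it restarts on the new host:

$ ceph osd crush set osd.1 1.81360 root=default host=ceph4
$ ceph osd tree   # host weights become ceph1 1.81360, ceph3 1.81360, ceph4 2.26329
$ ceph -w         # watch recovery until the PGs return to active+clean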