Hi Christian,

Thank you for your comments again. Very helpful. I will try to fix the current pool and see how it goes; it's good to learn some troubleshooting skills. I have jotted down the commands I intend to run just below, plus a couple of follow-up notes at the very bottom, under the quoted thread.

Regarding BTRFS vs XFS, I am not sure whether the documentation is old. My decision was based on this: http://ceph.com/docs/master/rados/configuration/filesystem-recommendations/

    "Note: We currently recommend XFS for production deployments. We recommend btrfs for testing, development, and any non-critical deployments. We believe that btrfs has the correct feature set and roadmap to serve Ceph in the long-term, but XFS and ext4 provide the necessary stability for today's deployments. btrfs development is proceeding rapidly: users should be comfortable installing the latest released upstream kernels and be able to track development activity for critical bug fixes."
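This is roughly what I am planning to run, going by your advice, to bring the existing rbd pool down to 2 replicas and to bump the placement groups to the next power of two (4 OSDs x 100 / 2 replicas = 200, rounded up to 256). Please shout if any of this looks wrong:

  # reduce replication of the existing pool from 3 to 2
  sudo ceph osd pool set rbd size 2

  # raise pg_num first, then pgp_num, straight to the next power of two
  sudo ceph osd pool set rbd pg_num 256
  sudo ceph osd pool set rbd pgp_num 256

And for any future pools I will add this to [global] in /etc/ceph/ceph.conf (matching the underscore style I already use there):

  osd_pool_default_size = 2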
Thanks,
Jiri

On 28/12/2014 16:01, Christian Balzer wrote:
Hello,

On Sun, 28 Dec 2014 11:58:59 +1100 jirik@xxxxxxxxxx wrote:

> Hi Christian. Thank you for your suggestions. I will set "osd pool default size" to 2 as you recommended. As mentioned, the documentation talks about OSDs, not nodes, so that must have confused me.

Note that changing this will only affect new pools, of course. So to sort out your current state, either start over with this value set before creating/starting anything, or reduce the current size (ceph osd pool set <poolname> size). Have a look at the crushmap example, or even better your own current one, and you will see that by default the host is the failure domain. Which of course makes a lot of sense.

> Regarding BTRFS, I thought that btrfs is the better option for the future, providing more features. I know that XFS might be more stable, but again my impression was that btrfs is the focus of future development. Is that correct?

I'm not a developer, but if you scour the ML archives you will find a number of threads about BTRFS (and ZFS). The biggest issues with BTRFS are not just stability but also the fact that it degrades rather quickly (fragmentation) due to its COW nature, and it has less smarts than ZFS in that area. So development on the Ceph side is not the issue per se. IMHO BTRFS looks more and more stillborn, and with regard to Ceph, ZFS might become the better choice (in the future), with KV store backends being an alternative for some use cases (also far from production ready at this time).

Regards,
Christian

> You are right with the round up. I forgot about that.
>
> Thanks for your help. Much appreciated.
> Jiri
>
> ----- Reply message -----
> From: "Christian Balzer" <chibi@xxxxxxx>
> To: <ceph-users@xxxxxxxx>
> Cc: "Jiri Kanicky" <jirik@xxxxxxxxxx>
> Subject: HEALTH_WARN 29 pgs degraded; 29 pgs stuck degraded; 133 pgs stuck unclean; 29 pgs stuck undersized;
> Date: Sun, Dec 28, 2014 03:29
>
> Hello,
>
> On Sun, 28 Dec 2014 01:52:39 +1100 Jiri Kanicky wrote:
>
> > Hi, I just built my CEPH cluster but am having problems with the health of the cluster.
>
> You're not telling us the version, but it's clearly 0.87 or beyond.
>
> > Here are a few details:
> > - I followed the ceph documentation.
>
> Outdated, unfortunately.
>
> > - I used the btrfs filesystem for all OSDs.
>
> Big mistake number 1; do some research (Google, the ML archives). Though not related to your problems here.
>
> > - I did not set "osd pool default size = 2", as I thought that with 2 nodes + 4 OSDs I could leave the default of 3. I am not sure if this was right.
>
> Big mistake, assumption number 2: with the default CRUSH rule, replicas are distributed across hosts. So that's your main issue here. Either set the size to 2 or use 3 hosts.
>
> > - I noticed that the default pools "data, metadata" were not created. Only the "rbd" pool was created.
>
> See outdated docs above. The majority of use cases is with RBD, so since Giant the cephfs pools are not created by default.
>
> > - As it was complaining that the pg_num was too low, I increased the pg_num for pool rbd to 133 (400/3) and ended up with "pool rbd pg_num 133 > pgp_num 64".
>
> Re-read the (in this case correct) documentation. It clearly states to round up to the nearest power of 2, in your case 256.
>
> Regards,
> Christian
>
> > Would you give me a hint where I have made the mistake? (I can remove the OSDs and start over if needed.)
> > cephadmin@ceph1:/etc/ceph$ sudo ceph health
> > HEALTH_WARN 29 pgs degraded; 29 pgs stuck degraded; 133 pgs stuck unclean; 29 pgs stuck undersized; 29 pgs undersized; pool rbd pg_num 133 > pgp_num 64
> >
> > cephadmin@ceph1:/etc/ceph$ sudo ceph status
> >     cluster bce2ff4d-e03b-4b75-9b17-8a48ee4d7788
> >      health HEALTH_WARN 29 pgs degraded; 29 pgs stuck degraded; 133 pgs stuck unclean; 29 pgs stuck undersized; 29 pgs undersized; pool rbd pg_num 133 > pgp_num 64
> >      monmap e1: 2 mons at {ceph1=192.168.30.21:6789/0,ceph2=192.168.30.22:6789/0}, election epoch 8, quorum 0,1 ceph1,ceph2
> >      osdmap e42: 4 osds: 4 up, 4 in
> >       pgmap v77: 133 pgs, 1 pools, 0 bytes data, 0 objects
> >             11704 kB used, 11154 GB / 11158 GB avail
> >                   29 active+undersized+degraded
> >                  104 active+remapped
> >
> > cephadmin@ceph1:/etc/ceph$ sudo ceph osd tree
> > # id    weight  type name       up/down reweight
> > -1      10.88   root default
> > -2      5.44            host ceph1
> > 0       2.72                    osd.0   up      1
> > 1       2.72                    osd.1   up      1
> > -3      5.44            host ceph2
> > 2       2.72                    osd.2   up      1
> > 3       2.72                    osd.3   up      1
> >
> > cephadmin@ceph1:/etc/ceph$ sudo ceph osd lspools
> > 0 rbd,
> >
> > cephadmin@ceph1:/etc/ceph$ cat ceph.conf
> > [global]
> > fsid = bce2ff4d-e03b-4b75-9b17-8a48ee4d7788
> > public_network = 192.168.30.0/24
> > cluster_network = 10.1.1.0/24
> > mon_initial_members = ceph1, ceph2
> > mon_host = 192.168.30.21,192.168.30.22
> > auth_cluster_required = cephx
> > auth_service_required = cephx
> > auth_client_required = cephx
> > filestore_xattr_use_omap = true
> >
> > Thank you
> > Jiri
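PS: Coming back to your point above about the host being the default failure domain, this is how I understand I can confirm it on my own cluster (a rough sketch; please correct me if this is not the right tooling):

  # dump the compiled crushmap from the cluster and decompile it for reading
  sudo ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  less crushmap.txt

  # in the default replicated ruleset I expect to see a line like
  #   step chooseleaf firstn 0 type host
  # which, as I understand it, is why size=3 cannot be satisfied with only 2 hosts.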
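PPS: And in case I decide to simply rebuild the OSDs on XFS instead, my understanding of the procedure is roughly the following. This is only a sketch based on the "remove an OSD" steps in the docs; the device name /dev/sdb and the ceph-deploy --fs-type flag are my assumptions, so corrections are welcome:

  # for each OSD in turn, e.g. osd.0 on ceph1:
  sudo ceph osd out 0
  sudo /etc/init.d/ceph stop osd.0        # run on the host that carries osd.0
  sudo ceph osd crush remove osd.0
  sudo ceph auth del osd.0
  sudo ceph osd rm 0

  # then wipe the disk and recreate the OSD with XFS (assuming ceph-deploy and /dev/sdb):
  ceph-deploy disk zap ceph1:sdb
  ceph-deploy osd create --fs-type xfs ceph1:sdb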