Hi Karan,
Thanks for your reply. OK, I have spent some time on it and finally found the problem behind this issue.
1) If I reboot any of the nodes, then when it comes back the OSD service does not start, because /var/lib/ceph/osd/ceph-0 is not mounted.
To work around it I manually edit /etc/fstab and add a mount entry for the Ceph OSD storage, e.g.:

UUID=142136cd-8325-44a7-ad67-80fe19ed3873 /var/lib/ceph/osd/ceph-0 xfs defaults,noatime 0 0
The above fixed the issue. Now the questions: is this a valid approach, and why does Ceph not activate the OSD drive by itself on reboot?
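For reference, this is roughly how I verified the entry before rebooting again (/dev/sdb1 is just where the OSD data happens to live on my node, so adjust for yours):

# stop the OSD so its mount point can be released
/etc/init.d/ceph stop osd.0
# confirm the UUID of the OSD data partition matches the fstab entry
blkid /dev/sdb1
# unmount, then let fstab mount everything back; mount -a complains if the entry is wrong
umount /var/lib/ceph/osd/ceph-0
mount -a
/etc/init.d/ceph start osd.0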
2) After fixing the above issue I rebooted all my nodes again, and this time there is a new warning: clock skew detected on mon.vms2. Here is the ceph -s output:
health HEALTH_WARN clock skew detected on mon.vms2
monmap e1: 2 mons at {vms1=192.168.1.128:6789/0,vms2=192.168.1.129:6789/0}, election epoch 14, quorum 0,1 vms1,vms2
mdsmap e11: 1/1/1 up {0=vms1=up:active}
osdmap e36: 3 osds: 3 up, 3 in
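Both mon hosts run ntpd, so to chase the skew I am running this on vms1 and vms2 (plain NTP tooling, nothing Ceph-specific; the service is called ntp on my boxes, it may be ntpd on yours):

# show the peers ntpd is tracking and the current offset
ntpq -p
# if the offset is large, stop ntpd, step the clock once, start it again
service ntp stop
ntpdate 0.pool.ntp.org
service ntp start
# then ask ceph which mon it still considers skewed
ceph health detail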
My current setup is 3 OSDs, 2 mons and 1 MDS.
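Since you mentioned 2 mons cannot form a proper quorum, I plan to add a third one, roughly like this (vms3 would be a new host I have not set up yet, so treat this as a sketch):

# install ceph on the new box and add it as a monitor
ceph-deploy install vms3
ceph-deploy mon create vms3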
Br.
Umar
On Tue, Dec 17, 2013 at 2:54 PM, Karan Singh <ksingh@xxxxxx> wrote:
Umar,

Ceph is stable for production; there are a large number of Ceph clusters deployed and running smoothly in PRODUCTION, and countless more in testing / pre-production. Since you are facing problems with your Ceph testing, it does not mean Ceph is unstable. I would suggest you put some time into troubleshooting your problem. What I see from your logs:

1) You have 2 mons, and that is a problem (either have 1 or have 3 to form a quorum). Add 1 more monitor node.
2) Out of 2 OSDs, only 1 is IN. Check where the other one is and try bringing both of them UP. Also add a few more OSDs to clear the health warning; 2 OSDs is a very low number.

Many Thanks
Karan Singh

From: "Umar Draz" <unix.co@xxxxxxxxx>
To: ceph-users@xxxxxxxx
Sent: Tuesday, 17 December, 2013 8:51:27 AM
Subject: After reboot nothing worked

Hello,

I have a 2 node Ceph cluster. I rebooted both hosts just to test whether the cluster would keep working after a reboot, and the result was that the cluster was unable to start.

Here is the ceph -s output:

health HEALTH_WARN 704 pgs stale; 704 pgs stuck stale; mds cluster is degraded; 1/1 in osds are down; clock skew detected on mon.kvm2
monmap e2: 2 mons at {kvm1=192.168.214.10:6789/0,kvm2=192.168.214.11:6789/0}, election epoch 16, quorum 0,1 kvm1,kvm2
mdsmap e13: 1/1/1 up {0=kvm1=up:replay}
osdmap e29: 2 osds: 0 up, 1 in
pgmap v68: 704 pgs, 4 pools, 9603 bytes data, 23 objects
1062 MB used, 80816 MB / 81879 MB avail
704 stale+active+clean

According to this useless documentation, I tried ceph osd tree. The output was:

# id    weight   type name      up/down  reweight
-1      0.16     root default
-2      0.07999  host kvm1
0       0.07999  osd.0          down     1
-3      0.07999  host kvm2
1       0.07999  osd.1          down     0

Then I tried:

sudo /etc/init.d/ceph -a start osd.0
sudo /etc/init.d/ceph -a start osd.1

to start the OSDs on both hosts. The result was:

/etc/init.d/ceph: osd.0 not found (/etc/ceph/ceph.conf defines , /var/lib/ceph defines )
/etc/init.d/ceph: osd.1 not found (/etc/ceph/ceph.conf defines , /var/lib/ceph defines )

Now the question is: what is this? Is Ceph really stable? Can we use it in a production environment?

Both of my hosts have NTP running and the time is up to date.

Br.
Umar
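PS: looking again at the "osd.0 not found" errors quoted above, my guess is that the init script only starts daemons that appear as sections in /etc/ceph/ceph.conf, i.e. something like the following would have been needed (host names taken from the osd tree output; a sketch, not a verified fix):

[osd.0]
host = kvm1    # guess: osd.0 lived on kvm1 per the osd tree

[osd.1]
host = kvm2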
Umar Draz
Network Architect
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com