Hello Community,

I need help fixing a long-running Ceph problem. The cluster is unhealthy and multiple OSDs are down. When I try to restart the OSDs, I get this error:

    2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
    common/Thread.cc: 129: FAILED assert(ret == 0)

Environment: 4 nodes, OSD + monitor, latest Firefly, CentOS 6.5, kernel 3.17.2-1.el6.elrepo.x86_64

- Tried upgrading from 0.80.7 to 0.80.8, but no luck.
- Tried the CentOS stock kernel 2.6.32, but no luck.
- Memory is not a problem: more than 150 GB is free.

Has anyone ever faced this problem?

Cluster status:

    cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
     health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete; 1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean; recovery 6061/31080 objects degraded (19.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02, mon.pouta-s03
     monmap e3: 3 mons at {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
     osdmap e26633: 239 osds: 85 up, 196 in
      pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
            4699 GB used, 707 TB / 711 TB avail
            6061/31080 objects degraded (19.501%)
                  14 down+remapped+peering
                  39 active
                3289 active+clean
                 547 peering
                 663 stale+down+peering
                 705 stale+active+remapped
                   1 active+degraded+remapped
                   1 stale+down+incomplete
                 484 down+peering
                 455 active+remapped
                3696 stale+active+degraded
                   4 remapped+peering
                  23 stale+down+remapped+peering
                  51 stale+active
                3637 active+degraded
                3799 stale+active+clean

OSD logs:

    2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
    common/Thread.cc: 129: FAILED assert(ret == 0)

     ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
     1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
     2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
     3: (Accepter::entry()+0x265) [0xb5c635]
     4: /lib64/libpthread.so.0() [0x3c8a6079d1]
     5: (clone()+0x6d) [0x3c8a2e89dd]
     NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

More information is in the Ceph tracker issue: http://tracker.ceph.com/issues/10988#change-49018

****************************************************************
Karan Singh
Systems Specialist, Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel. +358 9 4572001
fax +358 9 4572302
http://www.csc.fi/
****************************************************************
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com