-Greg
On Fri, Feb 13, 2015 at 6:06 AM Mohamed Pakkeer <mdfakkeer@xxxxxxxxx> wrote:
Hi all,

When I stop a respawning OSD on an OSD node, another OSD on the same node starts respawning. When an OSD starts respawning, it puts the following info in its log:

slow request 31.129671 seconds old, received at 2015-02-13 19:09:32.180496: osd_op(osd.551.95229:11191 100000005c4.00000033 [copy-get max 8388608] 13.f4ccd256 RETRY=50 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg

osd.551 is part of the cache tier, and all the respawning OSDs have similar log entries, each naming a different cache-tier OSD. If I restart all the OSDs on the cache-tier OSD node, the respawning stops and the cluster returns to the active+clean state. But when I try to write some data to the cluster, a random OSD starts respawning again. Can anyone help me solve this issue?

2015-02-13 19:10:02.309848 7f53eef54700 0 log_channel(default) log [WRN] : 11 slow requests, 11 included below; oldest blocked for > 30.132629 secs
2015-02-13 19:10:02.309854 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.132629 seconds old, received at 2015-02-13 19:09:32.177075: osd_op(osd.551.95229:63 100000002ae.00000000 [copy-from ver 7622] 13.7273b256 RETRY=130 snapc 1=[] ondisk+retry+write+ignore_overlay+enforce_snapc+known_if_redirected e95518) currently reached_pg
2015-02-13 19:10:02.309858 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.131608 seconds old, received at 2015-02-13 19:09:32.178096: osd_op(osd.551.95229:415 100000003a0.00000006 [copy-get max 8388608] 13.aefb256 RETRY=118 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg
2015-02-13 19:10:02.309861 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.130994 seconds old, received at 2015-02-13 19:09:32.178710: osd_op(osd.551.95229:2683 1000000029d.0000003b [copy-get max 8388608] 13.a2be1256 RETRY=115 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg
2015-02-13
19:10:02.309864 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.130426 seconds old, received at 2015-02-13 19:09:32.179278: osd_op(osd.551.95229:3939 100000004e9.00000032 [copy-get max 8388608] 13.6a25b256 RETRY=105 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg2015-02-13 19:10:02.309868 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.129697 seconds old, received at 2015-02-13 19:09:32.180007: osd_op(osd.551.95229:9749 10000000553.0000007e [copy-get max 8388608] 13.c8645256 RETRY=59 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg2015-02-13 19:10:03.310284 7f53eef54700 0 log_channel(default) log [WRN] : 11 slow requests, 6 included below; oldest blocked for > 31.133092 secs2015-02-13 19:10:03.310305 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.129671 seconds old, received at 2015-02-13 19:09:32.180496: osd_op(osd.551.95229:11191 100000005c4.00000033 [copy-get max 8388608] 13.f4ccd256 RETRY=50 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg2015-02-13 19:10:03.310308 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.128616 seconds old, received at 2015-02-13 19:09:32.181551: osd_op(osd.551.95229:12903 100000002e4.000000d6 [copy-get max 8388608] 13.f56a3256 RETRY=41 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg2015-02-13 19:10:03.310322 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.127807 seconds old, received at 2015-02-13 19:09:32.182360: osd_op(osd.551.95229:14165 10000000480.00000110 [copy-get max 8388608] 13.fd8c1256 RETRY=32 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg2015-02-13 19:10:03.310327 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.127320 seconds old, received 
at 2015-02-13 19:09:32.182847: osd_op(osd.551.95229:15013 1000000047f.00000133 [copy-get max 8388608] 13.b7b05256 RETRY=27 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg2015-02-13 19:10:03.310331 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.126935 seconds old, received at 2015-02-13 19:09:32.183232: osd_op(osd.551.95229:15767 1000000066d.0000001e [copy-get max 8388608] 13.3b017256 RETRY=25 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg2015-02-13 19:10:04.310685 7f53eef54700 0 log_channel(default) log [WRN] : 11 slow requests, 1 included below; oldest blocked for > 32.133566 secs2015-02-13 19:10:04.310705 7f53eef54700 0 log_channel(default) log [WRN] : slow request 32.126584 seconds old, received at 2015-02-13 19:09:32.184057: osd_op(osd.551.95229:16293 10000000601.00000029 [copy-get max 8388608] 13.293e1256 RETRY=25 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg2015-02-13 19:10:05.967407 7f4411770900 0 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578), process ceph-osd, pid 20717122015-02-13 19:10:05.971917 7f4411770900 0 filestore(/var/lib/ceph/osd/ceph-403) backend xfs (magic 0x58465342)2015-02-13 19:10:05.971936 7f4411770900 1 filestore(/var/lib/ceph/osd/ceph-403) disabling 'filestore replica fadvise' due to known issues with fadvise(DONTNEED) on xfs2015-02-13 19:10:06.009745 7f4411770900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: FIEMAP ioctl is supported and appears to work2015-02-13 19:10:06.009786 7f4411770900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option2015-02-13 19:10:06.026282 7f4411770900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)2015-02-13 
19:10:06.026421 7f4411770900 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_feature: extsize is disabled by conf2015-02-13 19:10:06.178991 7f4411770900 0 filestore(/var/lib/ceph/osd/ceph-403) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled2015-02-13 19:10:06.186378 7f4411770900 1 journal _open /var/lib/ceph/osd/ceph-403/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-13 19:10:06.248640 7f4411770900 1 journal _open /var/lib/ceph/osd/ceph-403/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-13 19:10:06.377309 7f4411770900 1 journal close /var/lib/ceph/osd/ceph-403/journal2015-02-13 19:10:06.449653 7f4411770900 0 filestore(/var/lib/ceph/osd/ceph-403) backend xfs (magic 0x58465342)2015-02-13 19:10:06.510328 7f4411770900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: FIEMAP ioctl is supported and appears to work2015-02-13 19:10:06.510362 7f4411770900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option2015-02-13 19:10:06.560259 7f4411770900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)2015-02-13 19:10:06.560353 7f4411770900 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_feature: extsize is disabled by conf2015-02-13 19:10:06.653577 7f4411770900 0 filestore(/var/lib/ceph/osd/ceph-403) mount: WRITEAHEAD journal mode explicitly enabled in conf2015-02-13 19:10:06.659761 7f4411770900 1 journal _open /var/lib/ceph/osd/ceph-403/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-13 19:10:06.706124 7f4411770900 1 journal _open /var/lib/ceph/osd/ceph-403/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-13 19:10:06.707848 7f4411770900 0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello2015-02-13 19:10:06.718958 
7f4411770900 0 osd.403 95523 crush map has features 104186773504, adjusting msgr requires for clients2015-02-13 19:10:06.718994 7f4411770900 0 osd.403 95523 crush map has features 379064680448 was 8705, adjusting msgr requires for mons2015-02-13 19:10:06.719003 7f4411770900 0 osd.403 95523 crush map has features 379064680448, adjusting msgr requires for osds2015-02-13 19:10:06.719047 7f4411770900 0 osd.403 95523 load_pgs2015-02-13 19:10:07.289273 7f4411770900 0 osd.403 95523 load_pgs opened 187 pgs2015-02-13 19:10:07.290528 7f4411770900 -1 osd.403 95523 set_disk_tp_priority(22) Invalid argument: osd_disk_thread_ioprio_class is but only the following values are allowed:idle, be or rt2015-02-13 19:10:07.299139 7f43fe0d1700 0 osd.403 95523 ignoring osdmap until we have initialized2015-02-13 19:10:07.299273 7f43fe0d1700 0 osd.403 95523 ignoring osdmap until we have initialized2015-02-13 19:10:07.367439 7f4411770900 0 osd.403 95523 done with init, starting boot process2015-02-13 19:10:09.628008 7f43c2b3d700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.5:6938/15118444 pipe(0xc4d59c0 sd=459 :6836 s=0 pgs=0 cs=0 l=0 c=0xbd78c60).accept connect_seq 0 vs existing 0 state wait2015-02-13 19:10:09.633725 7f43c3c4e700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.13:6810/9067610 pipe(0xcf7cb00 sd=436 :6836 s=0 pgs=0 cs=0 l=0 c=0xd006ec0).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:09.670055 7f43b7f92700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.5:6802/10118805 pipe(0xd23b9c0 sd=539 :6836 s=0 pgs=0 cs=0 l=0 c=0xd1a5b20).accept connect_seq 0 vs existing 0 state wait2015-02-13 19:10:09.675371 7f43ba2b5700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.8:6930/8115813 pipe(0xd6c7440 sd=522 :6836 s=0 pgs=0 cs=0 l=0 c=0xd16f180).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:09.679692 7f43b7487700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.4:6886/11127316 pipe(0xd23a3c0 sd=546 :6836 s=0 pgs=0 cs=0 l=0 c=0xd1a5440).accept connect_seq 0 vs existing 0 state 
wait2015-02-13 19:10:09.708472 7f43b3e51700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.13:6879/12175877 pipe(0xd9da100 sd=570 :6836 s=0 pgs=0 cs=0 l=0 c=0xda589a0).accept connect_seq 0 vs existing 0 state wait2015-02-13 19:10:09.717141 7f43b0f22700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.7:6819/11132701 pipe(0xd8a5180 sd=596 :6836 s=0 pgs=0 cs=0 l=0 c=0xe251080).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:09.721672 7f43aff12700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.6:6804/19191298 pipe(0xd8a3340 sd=603 :6836 s=0 pgs=0 cs=0 l=0 c=0xe250b00).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:09.730813 7f43b1326700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.7:6825/1301227 pipe(0xe0c1c80 sd=593 :6836 s=0 pgs=0 cs=0 l=0 c=0xe2514a0).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:09.879344 7f43ad8ec700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.8:6845/15123594 pipe(0xe0bfb80 sd=621 :6836 s=0 pgs=0 cs=0 l=0 c=0xe250160).accept connect_seq 0 vs existing 0 state wait2015-02-13 19:10:09.888010 7f43ab8cc700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.2:6832/9203280 pipe(0xce6f8c0 sd=648 :6836 s=0 pgs=0 cs=0 l=0 c=0xe85c100).accept connect_seq 0 vs existing 0 state wait2015-02-13 19:10:09.897543 7f43a4559700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.6:6916/10181510 pipe(0xd975b80 sd=699 :6836 s=0 pgs=0 cs=0 l=0 c=0xe913c80).accept connect_seq 0 vs existing 0 state wait2015-02-13 19:10:09.901181 7f43a1c30700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.6:6872/17198411 pipe(0xe9d4ec0 sd=715 :6836 s=0 pgs=0 cs=0 l=0 c=0xed53340).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:09.904586 7f43a1a2e700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.1:6816/14116404 pipe(0xe9d4940 sd=717 :6836 s=0 pgs=0 cs=0 l=0 c=0xed53080).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:09.910772 7f43a071b700 0 -- 10.1.100.14:6836/2071712 >> :/0 pipe(0xe9d4680 sd=721 :6836 s=0 pgs=0 cs=0 l=0 
c=0xed52f20).accept failed to getpeername (107) Transport endpoint is not connected2015-02-13 19:10:09.959742 7f439fd11700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.1:6835/17116573 pipe(0xe9d43c0 sd=727 :6836 s=0 pgs=0 cs=0 l=0 c=0xed52dc0).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:09.991344 7f439c4a6700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.6:6913/14182697 pipe(0xe9d3600 sd=756 :6836 s=0 pgs=0 cs=0 l=0 c=0xed526e0).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:10.099747 7f43a4256700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.13:6843/15181065 pipe(0xd975340 sd=702 :6836 s=0 pgs=0 cs=0 l=0 c=0xe913860).accept connect_seq 0 vs existing 0 state wait2015-02-13 19:10:10.246934 7f43919fc700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.1:6823/13119018 pipe(0xe9d3340 sd=840 :6836 s=0 pgs=0 cs=0 l=0 c=0xed52580).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:10.305592 7f4390aed700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.1:6922/10112411 pipe(0xe9d3080 sd=848 :6836 s=0 pgs=0 cs=0 l=0 c=0xed52420).accept connect_seq 0 vs existing 0 state wait2015-02-13 19:10:10.447464 7f438d0b3700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.1:6839/13117552 pipe(0xe9d2dc0 sd=876 :6836 s=0 pgs=0 cs=0 l=0 c=0xed522c0).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:10.528647 7f438c1a4700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.1:6841/10118584 pipe(0xe9d2b00 sd=884 :6836 s=0 pgs=0 cs=0 l=0 c=0xed52160).accept connect_seq 0 vs existing 0 state connecting2015-02-13 19:10:10.647182 7f4365e43700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.13:6936/10179964 pipe(0xe9d2840 sd=1229 :6836 s=0 pgs=0 cs=0 l=0 c=0xed52000).accept connect_seq 0 vs existing 0 state wait2015-02-13 19:10:10.763373 7f43619ff700 0 -- 10.1.100.14:6836/2071712 >> 10.1.100.13:6806/14167598 pipe(0xe9d2580 sd=1243 :6836 s=0 pgs=0 cs=0 l=0 c=0xa5f4940).accept connect_seq 0 vs existing 0 state wait2015-02-13 19:10:35.004540 7f2e9759b900 0 ceph version 
0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578), process ceph-osd, pid 20741802015-02-13 19:10:35.008746 7f2e9759b900 0 filestore(/var/lib/ceph/osd/ceph-403) backend xfs (magic 0x58465342)2015-02-13 19:10:35.008768 7f2e9759b900 1 filestore(/var/lib/ceph/osd/ceph-403) disabling 'filestore replica fadvise' due to known issues with fadvise(DONTNEED) on xfs2015-02-13 19:10:35.035532 7f2e9759b900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: FIEMAP ioctl is supported and appears to work2015-02-13 19:10:35.035622 7f2e9759b900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option2015-02-13 19:10:35.068698 7f2e9759b900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)2015-02-13 19:10:35.068826 7f2e9759b900 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_feature: extsize is disabled by conf2015-02-13 19:10:35.204041 7f2e9759b900 0 filestore(/var/lib/ceph/osd/ceph-403) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled2015-02-13 19:10:35.211697 7f2e9759b900 1 journal _open /var/lib/ceph/osd/ceph-403/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-13 19:10:35.257182 7f2e9759b900 1 journal _open /var/lib/ceph/osd/ceph-403/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-13 19:10:35.419868 7f2e9759b900 1 journal close /var/lib/ceph/osd/ceph-403/journal2015-02-13 19:10:35.447009 7f2e9759b900 0 filestore(/var/lib/ceph/osd/ceph-403) backend xfs (magic 0x58465342)2015-02-13 19:10:35.502898 7f2e9759b900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: FIEMAP ioctl is supported and appears to work2015-02-13 19:10:35.502929 7f2e9759b900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option2015-02-13 
19:10:35.552837 7f2e9759b900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)2015-02-13 19:10:35.552945 7f2e9759b900 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-403) detect_feature: extsize is disabled by conf2015-02-13 19:10:35.663059 7f2e9759b900 0 filestore(/var/lib/ceph/osd/ceph-403) mount: WRITEAHEAD journal mode explicitly enabled in conf2015-02-13 19:10:35.669623 7f2e9759b900 1 journal _open /var/lib/ceph/osd/ceph-403/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-13 19:10:35.714111 7f2e9759b900 1 journal _open /var/lib/ceph/osd/ceph-403/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-13 19:10:35.715330 7f2e9759b900 0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello2015-02-13 19:10:35.722675 7f2e9759b900 0 osd.403 95527 crush map has features 104186773504, adjusting msgr requires for clients2015-02-13 19:10:35.722703 7f2e9759b900 0 osd.403 95527 crush map has features 379064680448 was 8705, adjusting msgr requires for mons2015-02-13 19:10:35.722708 7f2e9759b900 0 osd.403 95527 crush map has features 379064680448, adjusting msgr requires for osds2015-02-13 19:10:35.722728 7f2e9759b900 0 osd.403 95527 load_pgs2015-02-13 19:10:36.230034 7f2e9759b900 0 osd.403 95527 load_pgs opened 187 pgs2015-02-13 19:10:36.231327 7f2e9759b900 -1 osd.403 95527 set_disk_tp_priority(22) Invalid argument: osd_disk_thread_ioprio_class is but only the following values are allowed:idle, be or rt2015-02-13 19:10:36.239635 7f2e83cef700 0 osd.403 95527 ignoring osdmap until we have initialized2015-02-13 19:10:36.247880 7f2e83cef700 0 osd.403 95527 ignoring osdmap until we have initialized2015-02-13 19:10:36.322880 7f2e9759b900 0 osd.403 95527 done with init, starting boot process2015-02-13 19:10:38.395813 7f2e503d7700 0 -- 10.1.100.14:6838/2074180 >> 10.1.100.11:6858/4560 pipe(0xb58db80 sd=397 :6838 s=0 pgs=0 cs=0 l=0 
c=0xb4652e0).accept connect_seq 0 vs existing 0 state connecting
2015-02-13 19:10:38.448288 7f2e43f13700 0 -- 10.1.100.14:6838/2074180 >> 10.1.100.15:6840/7116025 pipe(0xb045600 sd=506 :6838 s=0 pgs=0 cs=0 l=0 c=0xc59c580).accept connect_seq 0 vs existing 0 state connecting
2015-02-13 19:10:38.505886 7f2e3b98e700 0 -- 10.1.100.14:6838/2074180 >> 10.1.100.2:6831/14199331 pipe(0xbe4a940 sd=585 :6838 s=0 pgs=0 cs=0 l=0 c=0xafb4580).accept connect_s

Regards
K.Mohamed Pakkeer

On Thu, Feb 12, 2015 at 8:31 PM, Mohamed Pakkeer <mdfakkeer@xxxxxxxxx> wrote:

Hi all,

Cluster: 540 OSDs, cache tier and EC pool
ceph version 0.87

    cluster c2a97a2f-fdc7-4eb5-82ef-70c52f2eceb1
     health HEALTH_WARN 10 pgs peering; 21 pgs stale; 2 pgs stuck inactive; 2 pgs stuck unclean; 287 requests are blocked > 32 sec; recovery 24/6707031 objects degraded (0.000%); too few pgs per osd (13 < min 20); 1/552 in osds are down; clock skew detected on mon.master02, mon.master03
     monmap e3: 3 mons at {master01=10.1.2.231:6789/0,master02=10.1.2.232:6789/0,master03=10.1.2.233:6789/0}, election epoch 4, quorum 0,1,2 master01,master02,master03
     mdsmap e17: 1/1/1 up {0=master01=up:active}
     osdmap e57805: 552 osds: 551 up, 552 in
     pgmap v278604: 7264 pgs, 3 pools, 2027 GB data, 547 kobjects
           3811 GB used, 1958 TB / 1962 TB avail
           24/6707031 objects degraded (0.000%)
                 7 stale+peering
                 3 peering
              7240 active+clean
                13 stale
                 1 stale+active

We have mounted CephFS using the ceph-fuse client. Suddenly, some of the OSDs are respawning continuously, and the cluster health is still unstable.
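A side note on the slow requests quoted in this thread: the RETRY counts on the same cache-tier op keep climbing across restarts (RETRY=130 for object 100000002ae.00000000 in the Feb 13 log, RETRY=513 for the same object in the Feb 12 log), which suggests the same copy-from/copy-get promotion is being replayed every time an OSD respawns rather than ever completing. As a quick way to see which objects are stuck, a small throwaway script (illustrative only, not part of Ceph) can tally the highest RETRY value seen per object across the collected logs:

```python
import re

# Matches the osd_op(...) portion of a slow-request line: capture the
# object name (second whitespace-separated field inside osd_op) and the
# RETRY counter that follows the op description.
SLOW_RE = re.compile(
    r"osd_op\(\S+\s+(?P<obj>\S+)\s+\[[^\]]+\].*?RETRY=(?P<retry>\d+)"
)

def tally_retries(lines):
    """Return {object_name: highest RETRY seen} for slow-request log lines."""
    retries = {}
    for line in lines:
        m = SLOW_RE.search(line)
        if m:
            obj = m.group("obj")
            retries[obj] = max(retries.get(obj, 0), int(m.group("retry")))
    return retries

# Two (abbreviated) entries from the logs in this thread: the same object
# shows up on Feb 13 with RETRY=130 and on Feb 12 with RETRY=513.
sample = [
    "slow request 30.132629 seconds old, received at 2015-02-13 19:09:32.177075: "
    "osd_op(osd.551.95229:63 100000002ae.00000000 [copy-from ver 7622] "
    "13.7273b256 RETRY=130 snapc 1=[] ondisk+retry+write e95518) currently reached_pg",
    "slow request 30.960507 seconds old, received at 2015-02-12 18:42:20.134236: "
    "osd_op(osd.542.54048:1 100000002ae.00000000 [copy-from ver 7622] "
    "13.7273b256 RETRY=513 snapc 1=[] ondisk+retry+write e54708) currently reached_pg",
]

print(tally_retries(sample))  # the stuck object with its highest RETRY count
```

Feeding the full OSD logs through something like this makes it obvious whether the blocked requests are many different objects or a handful of promotions that never finish.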
How do we stop the respawning OSDs?

2015-02-12 18:41:51.562337 7f8371373900 0 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578), process ceph-osd, pid 3911
2015-02-12 18:41:51.564781 7f8371373900 0 filestore(/var/lib/ceph/osd/ceph-538) backend xfs (magic 0x58465342)
2015-02-12 18:41:51.564792 7f8371373900 1 filestore(/var/lib/ceph/osd/ceph-538) disabling 'filestore replica fadvise' due to known issues with fadvise(DONTNEED) on xfs
2015-02-12 18:41:51.655623 7f8371373900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is supported and appears to work
2015-02-12 18:41:51.655639 7f8371373900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2015-02-12 18:41:51.663864 7f8371373900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2015-02-12 18:41:51.663910 7f8371373900 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_feature: extsize is disabled by conf
2015-02-12 18:41:51.994021 7f8371373900 0 filestore(/var/lib/ceph/osd/ceph-538) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2015-02-12 18:41:52.788178 7f8371373900 1 journal _open /var/lib/ceph/osd/ceph-538/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1
2015-02-12 18:41:52.848430 7f8371373900 1 journal _open /var/lib/ceph/osd/ceph-538/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1
2015-02-12 18:41:52.922806 7f8371373900 1 journal close /var/lib/ceph/osd/ceph-538/journal
2015-02-12 18:41:52.948320 7f8371373900 0 filestore(/var/lib/ceph/osd/ceph-538) backend xfs (magic 0x58465342)
2015-02-12 18:41:52.981122 7f8371373900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is supported and appears to work
2015-02-12 18:41:52.981137 7f8371373900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features:
FIEMAP ioctl is disabled via 'filestore fiemap' config option2015-02-12 18:41:52.989395 7f8371373900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)2015-02-12 18:41:52.989440 7f8371373900 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_feature: extsize is disabled by conf2015-02-12 18:41:53.149095 7f8371373900 0 filestore(/var/lib/ceph/osd/ceph-538) mount: WRITEAHEAD journal mode explicitly enabled in conf2015-02-12 18:41:53.154258 7f8371373900 1 journal _open /var/lib/ceph/osd/ceph-538/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-12 18:41:53.217404 7f8371373900 1 journal _open /var/lib/ceph/osd/ceph-538/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-12 18:41:53.467512 7f8371373900 0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello2015-02-12 18:41:53.563846 7f8371373900 0 osd.538 54486 crush map has features 104186773504, adjusting msgr requires for clients2015-02-12 18:41:53.563865 7f8371373900 0 osd.538 54486 crush map has features 379064680448 was 8705, adjusting msgr requires for mons2015-02-12 18:41:53.563869 7f8371373900 0 osd.538 54486 crush map has features 379064680448, adjusting msgr requires for osds2015-02-12 18:41:53.563888 7f8371373900 0 osd.538 54486 load_pgs2015-02-12 18:41:55.430730 7f8371373900 0 osd.538 54486 load_pgs opened 137 pgs2015-02-12 18:41:55.432854 7f8371373900 -1 osd.538 54486 set_disk_tp_priority(22) Invalid argument: osd_disk_thread_ioprio_class is but only the following values are allowed:idle, be or rt2015-02-12 18:41:55.442748 7f835dfc8700 0 osd.538 54486 ignoring osdmap until we have initialized2015-02-12 18:41:55.456802 7f835dfc8700 0 osd.538 54486 ignoring osdmap until we have initialized2015-02-12 18:41:55.590831 7f8371373900 0 osd.538 54486 done with init, starting boot process2015-02-12 18:42:08.601833 7f830cead700 0 -- 10.1.100.14:6836/3911 >> 
10.1.100.4:6843/4178616 pipe(0x12528680 sd=495 :0 s=1 pgs=0 cs=0 l=0 c=0x10246680).fault with nothing to send, going to standby
2015-02-12 18:42:10.460257 7f830be70700 0 -- 10.1.100.14:6836/3911 >> 10.1.100.14:6806/3483 pipe(0x12528680 sd=536 :0 s=1 pgs=0 cs=0 l=0 c=0x10b612e0).fault with nothing to send, going to standby
2015-02-12 18:42:20.012175 7f830be70700 0 -- 10.1.100.14:6836/3911 >> 10.1.100.14:6806/3483 pipe(0x12528680 sd=536 :0 s=1 pgs=0 cs=1 l=0 c=0x10b612e0).fault
2015-02-12 18:42:20.038834 7f82f1a9e700 0 -- 10.1.2.14:0/3911 >> 10.1.2.14:6810/3483 pipe(0x12324ec0 sd=844 :0 s=1 pgs=0 cs=0 l=1 c=0x1231dc80).fault
2015-02-12 18:42:20.045447 7f82f1b9f700 0 -- 10.1.2.14:0/3911 >> 10.1.100.14:6807/3483 pipe(0x12325180 sd=846 :0 s=1 pgs=0 cs=0 l=1 c=0x1231dde0).fault
2015-02-12 18:42:49.094270 7f836797c700 -1 osd.538 54728 heartbeat_check: no reply from osd.176 since back 2015-02-12 18:42:28.444361 front 2015-02-12 18:42:28.444361 (cutoff 2015-02-12 18:42:29.094265)
2015-02-12 18:42:49.622922 7f834cfa6700 -1 osd.538 54728 heartbeat_check: no reply from osd.176 since back 2015-02-12 18:42:33.345980 front 2015-02-12 18:42:28.444361 (cutoff 2015-02-12 18:42:29.622919)
2015-02-12 18:42:51.094801 7f836797c700 0 log_channel(default) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 30.960507 secs
2015-02-12 18:42:51.094825 7f836797c700 0 log_channel(default) log [WRN] : slow request 30.960507 seconds old, received at 2015-02-12 18:42:20.134236: osd_op(osd.542.54048:1 100000002ae.00000000 [copy-from ver 7622] 13.7273b256 RETRY=513 snapc 1=[] ondisk+retry+write+ignore_overlay+enforce_snapc+known_if_redirected e54708) currently reached_pg
2015-02-12 18:42:53.354106 7f9655242900 0 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578), process ceph-osd, pid 84689
2015-02-12 18:42:53.359088 7f9655242900 0 filestore(/var/lib/ceph/osd/ceph-538) backend xfs (magic 0x58465342)
2015-02-12 18:42:53.359116 7f9655242900 1 filestore(/var/lib/ceph/osd/ceph-538) disabling
'filestore replica fadvise' due to known issues with fadvise(DONTNEED) on xfs2015-02-12 18:42:53.395684 7f9655242900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is supported and appears to work2015-02-12 18:42:53.395711 7f9655242900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option2015-02-12 18:42:53.445563 7f9655242900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)2015-02-12 18:42:53.445652 7f9655242900 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_feature: extsize is disabled by conf2015-02-12 18:42:53.579957 7f9655242900 0 filestore(/var/lib/ceph/osd/ceph-538) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled2015-02-12 18:42:53.584720 7f9655242900 1 journal _open /var/lib/ceph/osd/ceph-538/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-12 18:42:53.626940 7f9655242900 1 journal _open /var/lib/ceph/osd/ceph-538/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-12 18:42:53.704585 7f9655242900 1 journal close /var/lib/ceph/osd/ceph-538/journal2015-02-12 18:42:53.734618 7f9655242900 0 filestore(/var/lib/ceph/osd/ceph-538) backend xfs (magic 0x58465342)2015-02-12 18:42:53.771148 7f9655242900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is supported and appears to work2015-02-12 18:42:53.771179 7f9655242900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option2015-02-12 18:42:53.779389 7f9655242900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)2015-02-12 18:42:53.779449 7f9655242900 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_feature: extsize is disabled by 
conf2015-02-12 18:42:53.913933 7f9655242900 0 filestore(/var/lib/ceph/osd/ceph-538) mount: WRITEAHEAD journal mode explicitly enabled in conf2015-02-12 18:42:53.918308 7f9655242900 1 journal _open /var/lib/ceph/osd/ceph-538/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-12 18:42:53.951526 7f9655242900 1 journal _open /var/lib/ceph/osd/ceph-538/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 12015-02-12 18:42:53.952920 7f9655242900 0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello2015-02-12 18:42:53.959341 7f9655242900 0 osd.538 54728 crush map has features 104186773504, adjusting msgr requires for clients2015-02-12 18:42:53.959356 7f9655242900 0 osd.538 54728 crush map has features 379064680448 was 8705, adjusting msgr requires for mons2015-02-12 18:42:53.959360 7f9655242900 0 osd.538 54728 crush map has features 379064680448, adjusting msgr requires for osds2015-02-12 18:42:53.959378 7f9655242900 0 osd.538 54728 load_pgs2015-02-12 18:42:54.306386 7f9655242900 0 osd.538 54728 load_pgs opened 137 pgs2015-02-12 18:42:54.307429 7f9655242900 -1 osd.538 54728 set_disk_tp_priority(22) Invalid argument: osd_disk_thread_ioprio_class is but only the following values are allowed:idle, be or rt2015-02-12 18:42:54.314711 7f9641cb7700 0 osd.538 54728 ignoring osdmap until we have initialized2015-02-12 18:42:54.314749 7f9641cb7700 0 osd.538 54728 ignoring osdmap until we have initialized2015-02-12 18:42:54.371560 7f9655242900 0 osd.538 54728 done with init, starting boot process2015-02-12 18:42:56.079385 7f95fde9b700 0 -- 10.1.100.14:6874/84689 >> 10.1.100.4:6861/15126717 pipe(0xacf5340 sd=504 :6874 s=0 pgs=0 cs=0 l=0 c=0x9d72c60).accept connect_seq0 vs existing 0 state connecting2015-02-12 18:42:56.160775 7f95eecaa700 0 -- 10.1.100.14:6874/84689 >> 10.1.100.5:6942/14126479 pipe(0xa814840 sd=624 :6874 s=0 pgs=0 cs=0 l=0 c=0xb321a20).accept connect_seq0 vs existing 0 state wait2015-02-12 18:42:56.170650 
7f96000bd700 0 -- 10.1.100.14:6874/84689 >> 10.1.100.13:6808/14152675 pipe(0xaa41340 sd=486 :6874 s=0 pgs=0 cs=0 l=0 c=0xa8d9b20).accept connect_seq 0 vs existing 0 state connecting2015-02-12 18:42:56.215545 7f95e7533700 0 -- 10.1.100.14:6874/84689 >> 10.1.100.13:6903/11158823 pipe(0xb1da100 sd=683 :6874 s=0 pgs=0 cs=0 l=0 c=0xaea0260).accept connect_seq 0 vs existing 0 state connecting2015-02-12 18:42:56.222787 7f95e712f700 0 -- 10.1.100.14:6874/84689 >> 10.1.100.11:6831/10414111 pipe(0xb1d98c0 sd=686 :6874 s=0 pgs=0 cs=0 l=0 c=0xae9fe40).accept connect_seq 0 vs existing 0 state wait2015-02-12 18:42:56.471608 7f95d6a29700 0 -- 10.1.100.14:6874/84689 >> 10.1.100.6:6872/17198411 pipe(0xb593600 sd=813 :6874 s=0 pgs=0 cs=0 l=0 c=0xaf41b80).accept connect_seq0 vs existing 0 state wait2015-02-12 18:42:56.551898 7f95d4403700 0 -- 10.1.100.14:6874/84689 >> 10.1.100.1:6835/17116573 pipe(0xb593080 sd=832 :6874 s=0 pgs=0 cs=0 l=0 c=0xaf418c0).accept connect_seq0 vs existing 0 state wait2015-02-12 18:42:59.123753 7f7175bf7900 0 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578), process ceph-osd, pid 868602015-02-12 18:42:59.128606 7f7175bf7900 0 filestore(/var/lib/ceph/osd/ceph-538) backend xfs (magic 0x58465342)2015-02-12 18:42:59.128620 7f7175bf7900 1 filestore(/var/lib/ceph/osd/ceph-538) disabling 'filestore replica fadvise' due to known issues with fadvise(DONTNEED) on xfs2015-02-12 18:42:59.202824 7f7175bf7900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is supported and appears to work2015-02-12 18:42:59.202851 7f7175bf7900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option2015-02-12 18:42:59.402460 7f7175bf7900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)2015-02-12 18:42:59.402541 7f7175bf7900 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_feature: 
extsize is disabled by conf
2015-02-12 18:42:59.571199 7f7175bf7900 0 filestore(/var/lib/ceph/osd/ceph-538) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2015-02-12 18:42:59.576472 7f7175bf7900 1 journal _open /var/lib/ceph/osd/ceph-538/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1
2015-02-12 18:42:59.983516 7f7175bf7900 1 journal _open /var/lib/ceph/osd/ceph-538/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1
2015-02-12 18:43:00.245124 7f7175bf7900 1 journal close /var/lib/ceph/osd/ceph-538/journal
2015-02-12 18:43:00.348046 7f7175bf7900 0 filestore(/var/lib/ceph/osd/ceph-538) backend xfs (magic 0x58465342)
2015-02-12 18:43:00.396662 7f7175bf7900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is supported and appears to work
2015-02-12 18:43:00.396682 7f7175bf7900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-538) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option

[ 6216.144870] init: ceph-osd (ceph/538) main process ended, respawning
[ 6223.548268] init: ceph-osd (ceph/538) main process (1035681) terminated with status 1
[ 6223.548295] init: ceph-osd (ceph/538) main process ended, respawning
[ 6230.306315] init: ceph-osd (ceph/538) main process (1037980) terminated with status 1
[ 6230.306337] init: ceph-osd (ceph/538) main process ended, respawning
[ 6239.132669] init: ceph-osd (ceph/538) main process (1040206) terminated with status 1
[ 6239.132687] init: ceph-osd (ceph/538) main process ended, respawning
[ 6245.699440] init: ceph-osd (ceph/538) main process (1042452) terminated with status 1
[ 6245.699463] init: ceph-osd (ceph/538) main process ended, respawning
[ 6254.057325] init: ceph-osd (ceph/538) main process (1044412) terminated with status 1
[ 6254.057342] init: ceph-osd (ceph/538) main process ended, respawning
[ 6261.686181] init: ceph-osd (ceph/538) main process (1046709) terminated with status 1
[ 6261.686198] init: ceph-osd (ceph/538) main process ended, respawning
[ 6269.204085] init: ceph-osd (ceph/538) main process (1049003) terminated with status 1
[ 6269.204102] init: ceph-osd (ceph/538) main process ended, respawning
[ 6276.458609] init: ceph-osd (ceph/538) main process (1051292) terminated with status 1
[ 6276.458634] init: ceph-osd (ceph/538) main process ended, respawning
[ 6283.972596] init: ceph-osd (ceph/538) main process (1053612) terminated with status 1
[ 6283.972617] init: ceph-osd (ceph/538) main process ended, respawning
[ 6291.281523] init: ceph-osd (ceph/538) main process (1055886) terminated with status 1
[ 6291.281548] init: ceph-osd (ceph/538) main process ended, respawning
[ 6299.595198] init: ceph-osd (ceph/538) main process (1058175) terminated with status 1
[ 6299.595217] init: ceph-osd (ceph/538) main process ended, respawning
[ 6307.142994] init: ceph-osd (ceph/538) main process (1060419) terminated with status 1
[ 6307.143013] init: ceph-osd (ceph/538) main process ended, respawning

--
Regards
K.Mohamed Pakkeer

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
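Separately from the blocked cache-tier requests, every boot of osd.403 and osd.538 in the logs above prints `set_disk_tp_priority(22) Invalid argument: osd_disk_thread_ioprio_class is  but only the following values are allowed: idle, be or rt`. That wording indicates `osd_disk_thread_ioprio_class` is set to an empty or invalid value in ceph.conf; it may not be the cause of the respawns, but it is worth cleaning up. A sketch of the relevant [osd] section, with illustrative values rather than a recommendation:

```ini
[osd]
; Must be one of: idle, be, rt. An empty value triggers the
; set_disk_tp_priority "Invalid argument" message seen above.
osd disk thread ioprio class = idle
; 0 (highest) to 7 (lowest); only meaningful together with the class
; above, and only with the CFQ disk scheduler.
osd disk thread ioprio priority = 7
```

Alternatively, removing both lines lets the OSD fall back to its default disk-thread priority.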