osds crashing on Thread::create

first off, hello all. this is my first time posting to the list.

i have seen a recurring problem that started in the past week or so on one of my ceph clusters. osds crash, and it seems to happen whenever backfill or recovery starts. looking at the logs, it appears that the osd is asserting in src/common/Thread.cc when it tries to create a new thread. these osds are running 0.94.5 and i believe https://github.com/ceph/ceph/blob/v0.94.5/src/common/Thread.cc#L129 is the assert that is being hit. i looked back through the code for a couple of minutes and it looks like it's asserting on pthread_create returning something other than 0.

i'm not sure why pthread_create would be failing, and it looks like the code just writes the return code to stderr. i also wasn't able to determine where stderr from my osds ends up. from looking at /proc/<pid>/fd/{0,1,2} and lsof, stderr is a unix socket, but i don't see where it goes after that. the osds are started by ceph-disk activate.
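
for reference, pthread_create returns the error number directly instead of setting errno, so if i can ever capture that return code it should be easy to decode. below is a minimal python sketch of how i would map it to a name. the helper name describe_pthread_create_ret is made up for illustration, and the table only lists the errors documented in the linux pthread_create(3) man page. i have not yet seen the actual value from my osds, so this is not a diagnosis.

```python
import errno
import os

# Error codes documented for pthread_create(3) on Linux.
# The descriptions paraphrase the man page; they are not Ceph output.
PTHREAD_CREATE_ERRORS = {
    errno.EAGAIN: "EAGAIN: insufficient resources, or a system limit on threads was hit",
    errno.EINVAL: "EINVAL: invalid settings in the thread attributes",
    errno.EPERM: "EPERM: no permission to set the requested scheduling policy",
}


def describe_pthread_create_ret(ret):
    """Hypothetical helper: turn a pthread_create return value into text."""
    if ret == 0:
        return "0: success"
    # Fall back to the generic OS message for codes outside the documented set.
    fallback = f"{ret}: {os.strerror(ret)} (not a documented pthread_create error)"
    return PTHREAD_CREATE_ERRORS.get(ret, fallback)


if __name__ == "__main__":
    for code in (0, errno.EAGAIN, errno.EINVAL, errno.EPERM):
        print(describe_pthread_create_ret(code))
```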

do any of you have any ideas as to what might be causing this, or how i might troubleshoot it further? i'm attaching a trimmed version of the osd log. i removed some extraneous bits from after the osd was restarted and a large number of 'recent events' from well before the crash.
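
one thing i plan to check, on the unconfirmed assumption that the failure is a thread limit being hit, is how many threads each osd has compared with the kernel and per-user limits. here is a rough python sketch that reads the standard linux /proc files. the function names are my own, and note that the 'Max processes' limit counts all threads owned by the user, not just those of one osd.

```python
import re


def parse_thread_count(status_text):
    """Return the 'Threads:' value from the text of /proc/<pid>/status."""
    match = re.search(r"^Threads:\s+(\d+)$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no Threads: field found")
    return int(match.group(1))


def parse_max_processes(limits_text):
    """Return the soft 'Max processes' limit from /proc/<pid>/limits text.

    Returns None when the limit is 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max processes"):
            soft = line.split()[2]  # columns: name (2 words), soft, hard, units
            return None if soft == "unlimited" else int(soft)
    raise ValueError("no 'Max processes' line found")


def read_text(path):
    with open(path) as handle:
        return handle.read()


def report(pid):
    """Collect the thread count of one process and the relevant limits."""
    return {
        "threads": parse_thread_count(read_text(f"/proc/{pid}/status")),
        "max_processes_soft": parse_max_processes(read_text(f"/proc/{pid}/limits")),
        "kernel_threads_max": int(read_text("/proc/sys/kernel/threads-max")),
        "kernel_pid_max": int(read_text("/proc/sys/kernel/pid_max")),
    }

# Example (run as a user allowed to read the osd's /proc entries):
#   print(report(7034))   # 7034 is the osd pid that appears in the log below
```

if the thread count is close to any of those limits while backfill is running, that would at least be consistent with the assert, though it would not prove it.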

thanks

mike
2016-03-07 10:51:08.739907 7fb0c56a1700  0 -- 10.208.16.26:6802/7034 >> 10.208.16.42:0/3019478 pipe(0x47aab000 sd=1360 :6802 s=0 pgs=0 cs=0 l=1 c=0x1c96f080).accept replacing existing (lossy) channel (new one lossy=1)
2016-03-07 12:09:27.465221 7fb0d5a7f700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.26:6800/2245 pipe(0x2b1ce000 sd=1228 :6802 s=0 pgs=0 cs=0 l=0 c=0x55dd020).accept connect_seq 10 vs existing 10 state standby
2016-03-07 12:09:27.465290 7fb0c9f05700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.26:6807/1020674 pipe(0x3bce0000 sd=1259 :6802 s=0 pgs=0 cs=0 l=0 c=0x55da2c0).accept connect_seq 13 vs existing 13 state standby
2016-03-07 12:09:27.466053 7fb0c9f05700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.26:6807/1020674 pipe(0x3bce0000 sd=1259 :6802 s=0 pgs=0 cs=0 l=0 c=0x55da2c0).accept connect_seq 14 vs existing 13 state standby
2016-03-07 12:09:27.466106 7fb0d5a7f700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.26:6800/2245 pipe(0x2b1ce000 sd=1228 :6802 s=0 pgs=0 cs=0 l=0 c=0x55dd020).accept connect_seq 11 vs existing 10 state standby
2016-03-07 12:09:28.358657 7fb135fd8700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.24:6800/26784 pipe(0x534c6000 sd=1311 :56029 s=2 pgs=35 cs=1 l=0 c=0x359f75a0).fault with nothing to send, going to standby
2016-03-07 12:09:28.359955 7fb0b0e36700  0 -- 10.208.12.26:0/7034 >> 10.208.12.24:6801/26784 pipe(0x3b466000 sd=1103 :0 s=1 pgs=0 cs=0 l=1 c=0x535351e0).fault
2016-03-07 12:09:28.360759 7fb0b0d35700  0 -- 10.208.12.26:0/7034 >> 10.208.16.24:6801/26784 pipe(0x5ddc2000 sd=1157 :0 s=1 pgs=0 cs=0 l=1 c=0x53535340).fault
2016-03-07 12:09:28.469563 7fb160395700  0 log_channel(cluster) log [INF] : 13.671 restarting backfill on osd.166 from (116308'3515965,116359'3519022] MAX to 117741'3716137
2016-03-07 12:09:28.469613 7fb15fb94700  0 log_channel(cluster) log [INF] : 13.5e3 restarting backfill on osd.166 from (116308'3293585,116359'3296666] MAX to 117741'3476172
2016-03-07 12:09:28.478230 7fb160395700  0 log_channel(cluster) log [INF] : 13.42d restarting backfill on osd.166 from (116308'5692257,116359'5695353] MAX to 117741'5986232
2016-03-07 12:09:28.479461 7fb15fb94700  0 log_channel(cluster) log [INF] : 13.6da restarting backfill on osd.166 from (116308'3186858,116359'3189912] MAX to 117741'3327862
2016-03-07 12:09:28.493689 7fb15fb94700  0 log_channel(cluster) log [INF] : 13.6da restarting backfill on osd.190 from (0'0,0'0] MAX to 117741'3327862
2016-03-07 12:09:28.508933 7fb160395700  0 log_channel(cluster) log [INF] : 13.791 restarting backfill on osd.166 from (116308'4423278,116359'4426295] MAX to 117741'4593697
2016-03-07 12:09:28.603153 7fb1338b1700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.24:6800/26784 pipe(0x534c6000 sd=1157 :56029 s=1 pgs=35 cs=1 l=0 c=0x359f75a0).fault
2016-03-07 12:09:29.482123 7fb15fb94700  0 log_channel(cluster) log [INF] : 13.650 restarting backfill on osd.166 from (116308'2664728,116359'2667752] MAX to 117741'2770454
2016-03-07 12:09:45.637558 7fb170d22700 -1 osd.175 117745 heartbeat_check: no reply from osd.141 since back 2016-03-07 12:09:25.523264 front 2016-03-07 12:09:25.523264 (cutoff 2016-03-07 12:09:25.637555)
2016-03-07 12:09:46.638025 7fb170d22700 -1 osd.175 117745 heartbeat_check: no reply from osd.141 since back 2016-03-07 12:09:25.523264 front 2016-03-07 12:09:25.523264 (cutoff 2016-03-07 12:09:26.638022)
2016-03-07 12:09:47.638159 7fb170d22700 -1 osd.175 117745 heartbeat_check: no reply from osd.141 since back 2016-03-07 12:09:25.523264 front 2016-03-07 12:09:25.523264 (cutoff 2016-03-07 12:09:27.638154)
2016-03-07 12:09:48.231986 7fb158b86700 -1 osd.175 117745 heartbeat_check: no reply from osd.141 since back 2016-03-07 12:09:25.523264 front 2016-03-07 12:09:25.523264 (cutoff 2016-03-07 12:09:28.231984)
2016-03-07 12:09:48.638412 7fb170d22700 -1 osd.175 117745 heartbeat_check: no reply from osd.141 since back 2016-03-07 12:09:25.523264 front 2016-03-07 12:09:25.523264 (cutoff 2016-03-07 12:09:28.638408)
2016-03-07 12:14:50.864247 7fb0b0d35700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7fb0b0d35700 time 2016-03-07 12:14:50.821645
common/Thread.cc: 129: FAILED assert(ret == 0)

 ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xbc9d85]
 2: (Thread::create(unsigned long)+0x8a) [0xbad54a]
 3: (Pipe::connect()+0x30d4) [0xca76e4]
 4: (Pipe::writer()+0x4ea) [0xca94ea]
 5: (Pipe::Writer::entry()+0xd) [0xcb46ed]
 6: (()+0x7df5) [0x7fb178e8fdf5]
 7: (clone()+0x6d) [0x7fb1779721ad]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
<SNIP>
  -100> 2016-03-07 11:35:19.651846 7fb1758b7700  1 do_command '1' '
   -99> 2016-03-07 11:35:19.652306 7fb1758b7700  1 do_command '1' 'result is 15751 bytes
   -98> 2016-03-07 11:36:19.651309 7fb1758b7700  1 do_command '1' '
   -97> 2016-03-07 11:36:19.651756 7fb1758b7700  1 do_command '1' 'result is 15769 bytes
   -96> 2016-03-07 11:37:19.650634 7fb1758b7700  1 do_command '1' '
   -95> 2016-03-07 11:37:19.651082 7fb1758b7700  1 do_command '1' 'result is 15749 bytes
   -94> 2016-03-07 11:38:19.650457 7fb1758b7700  1 do_command '1' '
   -93> 2016-03-07 11:38:19.650915 7fb1758b7700  1 do_command '1' 'result is 15749 bytes
   -92> 2016-03-07 11:39:19.650435 7fb1758b7700  1 do_command '1' '
   -91> 2016-03-07 11:39:19.650879 7fb1758b7700  1 do_command '1' 'result is 15753 bytes
   -90> 2016-03-07 11:40:19.650433 7fb1758b7700  1 do_command '1' '
   -89> 2016-03-07 11:40:19.651016 7fb1758b7700  1 do_command '1' 'result is 15756 bytes
   -88> 2016-03-07 11:41:19.650638 7fb1758b7700  1 do_command '1' '
   -87> 2016-03-07 11:41:19.651082 7fb1758b7700  1 do_command '1' 'result is 15750 bytes
   -86> 2016-03-07 11:42:19.650436 7fb1758b7700  1 do_command '1' '
   -85> 2016-03-07 11:42:19.650934 7fb1758b7700  1 do_command '1' 'result is 15758 bytes
   -84> 2016-03-07 11:43:19.650447 7fb1758b7700  1 do_command '1' '
   -83> 2016-03-07 11:43:19.650892 7fb1758b7700  1 do_command '1' 'result is 15748 bytes
   -82> 2016-03-07 11:44:19.652266 7fb1758b7700  1 do_command '1' '
   -81> 2016-03-07 11:44:19.652718 7fb1758b7700  1 do_command '1' 'result is 15752 bytes
   -80> 2016-03-07 11:45:19.650441 7fb1758b7700  1 do_command '1' '
   -79> 2016-03-07 11:45:19.651208 7fb1758b7700  1 do_command '1' 'result is 15760 bytes
   -78> 2016-03-07 11:46:19.650444 7fb1758b7700  1 do_command '1' '
   -77> 2016-03-07 11:46:19.650873 7fb1758b7700  1 do_command '1' 'result is 15764 bytes
   -76> 2016-03-07 11:47:19.652258 7fb1758b7700  1 do_command '1' '
   -75> 2016-03-07 11:47:19.652563 7fb1758b7700  1 do_command '1' 'result is 15750 bytes
   -74> 2016-03-07 11:48:19.650457 7fb1758b7700  1 do_command '1' '
   -73> 2016-03-07 11:48:19.651595 7fb1758b7700  1 do_command '1' 'result is 15752 bytes
   -72> 2016-03-07 11:49:19.650505 7fb1758b7700  1 do_command '1' '
   -71> 2016-03-07 11:49:19.650954 7fb1758b7700  1 do_command '1' 'result is 15756 bytes
   -70> 2016-03-07 11:50:19.651853 7fb1758b7700  1 do_command '1' '
   -69> 2016-03-07 11:50:19.652173 7fb1758b7700  1 do_command '1' 'result is 15761 bytes
   -68> 2016-03-07 11:51:19.650829 7fb1758b7700  1 do_command '1' '
   -67> 2016-03-07 11:51:19.651298 7fb1758b7700  1 do_command '1' 'result is 15750 bytes
   -66> 2016-03-07 11:52:19.650442 7fb1758b7700  1 do_command '1' '
   -65> 2016-03-07 11:52:19.650796 7fb1758b7700  1 do_command '1' 'result is 15750 bytes
   -64> 2016-03-07 11:53:19.650437 7fb1758b7700  1 do_command '1' '
   -63> 2016-03-07 11:53:19.650737 7fb1758b7700  1 do_command '1' 'result is 15753 bytes
   -62> 2016-03-07 11:54:19.650420 7fb1758b7700  1 do_command '1' '
   -61> 2016-03-07 11:54:19.650913 7fb1758b7700  1 do_command '1' 'result is 15749 bytes
   -60> 2016-03-07 11:55:19.650406 7fb1758b7700  1 do_command '1' '
   -59> 2016-03-07 11:55:19.650877 7fb1758b7700  1 do_command '1' 'result is 15753 bytes
   -58> 2016-03-07 11:56:19.650393 7fb1758b7700  1 do_command '1' '
   -57> 2016-03-07 11:56:19.650840 7fb1758b7700  1 do_command '1' 'result is 15749 bytes
   -56> 2016-03-07 11:57:19.650449 7fb1758b7700  1 do_command '1' '
   -55> 2016-03-07 11:57:19.650902 7fb1758b7700  1 do_command '1' 'result is 15752 bytes
   -54> 2016-03-07 11:58:19.650454 7fb1758b7700  1 do_command '1' '
   -53> 2016-03-07 11:58:19.650915 7fb1758b7700  1 do_command '1' 'result is 15751 bytes
   -52> 2016-03-07 11:59:19.650396 7fb1758b7700  1 do_command '1' '
   -51> 2016-03-07 11:59:19.650850 7fb1758b7700  1 do_command '1' 'result is 15748 bytes
   -50> 2016-03-07 12:00:19.651558 7fb1758b7700  1 do_command '1' '
   -49> 2016-03-07 12:00:19.652050 7fb1758b7700  1 do_command '1' 'result is 15765 bytes
   -48> 2016-03-07 12:01:19.650460 7fb1758b7700  1 do_command '1' '
   -47> 2016-03-07 12:01:19.650801 7fb1758b7700  1 do_command '1' 'result is 15754 bytes
   -46> 2016-03-07 12:02:19.650433 7fb1758b7700  1 do_command '1' '
   -45> 2016-03-07 12:02:19.650869 7fb1758b7700  1 do_command '1' 'result is 15760 bytes
   -44> 2016-03-07 12:03:19.650485 7fb1758b7700  1 do_command '1' '
   -43> 2016-03-07 12:03:19.650930 7fb1758b7700  1 do_command '1' 'result is 15749 bytes
   -42> 2016-03-07 12:04:19.652754 7fb1758b7700  1 do_command '1' '
   -41> 2016-03-07 12:04:19.653050 7fb1758b7700  1 do_command '1' 'result is 15750 bytes
   -40> 2016-03-07 12:05:19.650525 7fb1758b7700  1 do_command '1' '
   -39> 2016-03-07 12:05:19.650954 7fb1758b7700  1 do_command '1' 'result is 15754 bytes
   -38> 2016-03-07 12:06:19.652863 7fb1758b7700  1 do_command '1' '
   -37> 2016-03-07 12:06:19.653323 7fb1758b7700  1 do_command '1' 'result is 15750 bytes
   -36> 2016-03-07 12:07:19.653147 7fb1758b7700  1 do_command '1' '
   -35> 2016-03-07 12:07:19.653472 7fb1758b7700  1 do_command '1' 'result is 15756 bytes
   -34> 2016-03-07 12:08:19.650447 7fb1758b7700  1 do_command '1' '
   -33> 2016-03-07 12:08:19.650900 7fb1758b7700  1 do_command '1' 'result is 15766 bytes
   -32> 2016-03-07 12:09:19.651939 7fb1758b7700  1 do_command '1' '
   -31> 2016-03-07 12:09:19.652310 7fb1758b7700  1 do_command '1' 'result is 15756 bytes
   -30> 2016-03-07 12:09:27.465221 7fb0d5a7f700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.26:6800/2245 pipe(0x2b1ce000 sd=1228 :6802 s=0 pgs=0 cs=0 l=0 c=0x55dd020).accept connect_seq 10 vs existing 10 state standby
   -29> 2016-03-07 12:09:27.465290 7fb0c9f05700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.26:6807/1020674 pipe(0x3bce0000 sd=1259 :6802 s=0 pgs=0 cs=0 l=0 c=0x55da2c0).accept connect_seq 13 vs existing 13 state standby
   -28> 2016-03-07 12:09:27.466053 7fb0c9f05700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.26:6807/1020674 pipe(0x3bce0000 sd=1259 :6802 s=0 pgs=0 cs=0 l=0 c=0x55da2c0).accept connect_seq 14 vs existing 13 state standby
   -27> 2016-03-07 12:09:27.466106 7fb0d5a7f700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.26:6800/2245 pipe(0x2b1ce000 sd=1228 :6802 s=0 pgs=0 cs=0 l=0 c=0x55dd020).accept connect_seq 11 vs existing 10 state standby
   -26> 2016-03-07 12:09:28.358657 7fb135fd8700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.24:6800/26784 pipe(0x534c6000 sd=1311 :56029 s=2 pgs=35 cs=1 l=0 c=0x359f75a0).fault with nothing to send, going to standby
   -25> 2016-03-07 12:09:28.359955 7fb0b0e36700  0 -- 10.208.12.26:0/7034 >> 10.208.12.24:6801/26784 pipe(0x3b466000 sd=1103 :0 s=1 pgs=0 cs=0 l=1 c=0x535351e0).fault
   -24> 2016-03-07 12:09:28.360759 7fb0b0d35700  0 -- 10.208.12.26:0/7034 >> 10.208.16.24:6801/26784 pipe(0x5ddc2000 sd=1157 :0 s=1 pgs=0 cs=0 l=1 c=0x53535340).fault
   -23> 2016-03-07 12:09:28.469563 7fb160395700  0 log_channel(cluster) log [INF] : 13.671 restarting backfill on osd.166 from (116308'3515965,116359'3519022] MAX to 117741'3716137
   -22> 2016-03-07 12:09:28.469613 7fb15fb94700  0 log_channel(cluster) log [INF] : 13.5e3 restarting backfill on osd.166 from (116308'3293585,116359'3296666] MAX to 117741'3476172
   -21> 2016-03-07 12:09:28.478230 7fb160395700  0 log_channel(cluster) log [INF] : 13.42d restarting backfill on osd.166 from (116308'5692257,116359'5695353] MAX to 117741'5986232
   -20> 2016-03-07 12:09:28.479461 7fb15fb94700  0 log_channel(cluster) log [INF] : 13.6da restarting backfill on osd.166 from (116308'3186858,116359'3189912] MAX to 117741'3327862
   -19> 2016-03-07 12:09:28.493689 7fb15fb94700  0 log_channel(cluster) log [INF] : 13.6da restarting backfill on osd.190 from (0'0,0'0] MAX to 117741'3327862
   -18> 2016-03-07 12:09:28.508933 7fb160395700  0 log_channel(cluster) log [INF] : 13.791 restarting backfill on osd.166 from (116308'4423278,116359'4426295] MAX to 117741'4593697
   -17> 2016-03-07 12:09:28.603153 7fb1338b1700  0 -- 10.208.12.26:6802/7034 >> 10.208.12.24:6800/26784 pipe(0x534c6000 sd=1157 :56029 s=1 pgs=35 cs=1 l=0 c=0x359f75a0).fault
   -16> 2016-03-07 12:09:29.482123 7fb15fb94700  0 log_channel(cluster) log [INF] : 13.650 restarting backfill on osd.166 from (116308'2664728,116359'2667752] MAX to 117741'2770454
   -15> 2016-03-07 12:09:45.637558 7fb170d22700 -1 osd.175 117745 heartbeat_check: no reply from osd.141 since back 2016-03-07 12:09:25.523264 front 2016-03-07 12:09:25.523264 (cutoff 2016-03-07 12:09:25.637555)
   -14> 2016-03-07 12:09:46.638025 7fb170d22700 -1 osd.175 117745 heartbeat_check: no reply from osd.141 since back 2016-03-07 12:09:25.523264 front 2016-03-07 12:09:25.523264 (cutoff 2016-03-07 12:09:26.638022)
   -13> 2016-03-07 12:09:47.638159 7fb170d22700 -1 osd.175 117745 heartbeat_check: no reply from osd.141 since back 2016-03-07 12:09:25.523264 front 2016-03-07 12:09:25.523264 (cutoff 2016-03-07 12:09:27.638154)
   -12> 2016-03-07 12:09:48.231986 7fb158b86700 -1 osd.175 117745 heartbeat_check: no reply from osd.141 since back 2016-03-07 12:09:25.523264 front 2016-03-07 12:09:25.523264 (cutoff 2016-03-07 12:09:28.231984)
   -11> 2016-03-07 12:09:48.638412 7fb170d22700 -1 osd.175 117745 heartbeat_check: no reply from osd.141 since back 2016-03-07 12:09:25.523264 front 2016-03-07 12:09:25.523264 (cutoff 2016-03-07 12:09:28.638408)
   -10> 2016-03-07 12:10:19.650574 7fb1758b7700  1 do_command '1' '
    -9> 2016-03-07 12:10:19.650993 7fb1758b7700  1 do_command '1' 'result is 15757 bytes
    -8> 2016-03-07 12:11:19.650489 7fb1758b7700  1 do_command '1' '
    -7> 2016-03-07 12:11:19.650937 7fb1758b7700  1 do_command '1' 'result is 15749 bytes
    -6> 2016-03-07 12:12:19.651401 7fb1758b7700  1 do_command '1' '
    -5> 2016-03-07 12:12:19.651701 7fb1758b7700  1 do_command '1' 'result is 15750 bytes
    -4> 2016-03-07 12:13:19.650556 7fb1758b7700  1 do_command '1' '
    -3> 2016-03-07 12:13:19.650985 7fb1758b7700  1 do_command '1' 'result is 15755 bytes
    -2> 2016-03-07 12:14:19.650449 7fb1758b7700  1 do_command '1' '
    -1> 2016-03-07 12:14:19.650887 7fb1758b7700  1 do_command '1' 'result is 15754 bytes
     0> 2016-03-07 12:14:50.864247 7fb0b0d35700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7fb0b0d35700 time 2016-03-07 12:14:50.821645
common/Thread.cc: 129: FAILED assert(ret == 0)

 ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xbc9d85]
 2: (Thread::create(unsigned long)+0x8a) [0xbad54a]
 3: (Pipe::connect()+0x30d4) [0xca76e4]
 4: (Pipe::writer()+0x4ea) [0xca94ea]
 5: (Pipe::Writer::entry()+0xd) [0xcb46ed]
 6: (()+0x7df5) [0x7fb178e8fdf5]
 7: (clone()+0x6d) [0x7fb1779721ad]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 0 lockdep
   0/ 0 context
   0/ 0 crush
   0/ 0 mds
   0/ 0 mds_balancer
   0/ 0 mds_locker
   0/ 0 mds_log
   0/ 0 mds_log_expire
   0/ 0 mds_migrator
   0/ 0 buffer
   0/ 0 timer
   0/ 0 filer
   0/ 1 striper
   0/ 0 objecter
   0/ 0 rados
   0/ 0 rbd
   0/ 5 rbd_replay
   0/ 0 journaler
   0/ 0 objectcacher
   0/ 0 client
   0/ 0 osd
   0/ 0 optracker
   0/ 0 objclass
   0/ 0 filestore
   1/ 3 keyvaluestore
   0/ 0 journal
   0/ 0 ms
   0/ 0 mon
   0/ 0 monc
   0/ 0 paxos
   0/ 0 tp
   0/ 0 auth
   1/ 5 crypto
   0/ 0 finisher
   0/ 0 heartbeatmap
   0/ 0 perfcounter
   0/ 0 rgw
   1/10 civetweb
   1/ 5 javaclient
   0/ 0 asok
   0/ 0 throttle
   0/ 0 refs
   1/ 5 xio
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.175.log
--- end dump of recent events ---
2016-03-07 12:14:50.864098 7fb170d22700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7fb170d22700 time 2016-03-07 12:14:50.821743
common/Thread.cc: 129: FAILED assert(ret == 0)

 ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xbc9d85]
 2: (Thread::create(unsigned long)+0x8a) [0xbad54a]
 3: (SimpleMessenger::connect_rank(entity_addr_t const&, int, PipeConnection*, Message*)+0x185) [0xba44f5]
 4: (SimpleMessenger::get_connection(entity_inst_t const&)+0x448) [0xba4d78]
 5: (OSDService::get_con_osd_hb(int, unsigned int)+0x1f6) [0x66d7f6]
 6: (OSD::_add_heartbeat_peer(int)+0xb7) [0x68b5d7]
 7: (OSD::maybe_update_heartbeat_peers()+0xbb0) [0x68c700]
 8: (OSD::tick()+0x1fe) [0x6b101e]
 9: (Context::complete(int)+0x9) [0x6c0099]
 10: (SafeTimer::timer_thread()+0x104) [0xbb2964]
 11: (SafeTimerThread::entry()+0xd) [0xbb391d]
 12: (()+0x7df5) [0x7fb178e8fdf5]
 13: (clone()+0x6d) [0x7fb1779721ad]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
     0> 2016-03-07 12:14:50.864098 7fb170d22700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7fb170d22700 time 2016-03-07 12:14:50.821743
common/Thread.cc: 129: FAILED assert(ret == 0)

 ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xbc9d85]
 2: (Thread::create(unsigned long)+0x8a) [0xbad54a]
 3: (SimpleMessenger::connect_rank(entity_addr_t const&, int, PipeConnection*, Message*)+0x185) [0xba44f5]
 4: (SimpleMessenger::get_connection(entity_inst_t const&)+0x448) [0xba4d78]
 5: (OSDService::get_con_osd_hb(int, unsigned int)+0x1f6) [0x66d7f6]
 6: (OSD::_add_heartbeat_peer(int)+0xb7) [0x68b5d7]
 7: (OSD::maybe_update_heartbeat_peers()+0xbb0) [0x68c700]
 8: (OSD::tick()+0x1fe) [0x6b101e]
 9: (Context::complete(int)+0x9) [0x6c0099]
 10: (SafeTimer::timer_thread()+0x104) [0xbb2964]
 11: (SafeTimerThread::entry()+0xd) [0xbb391d]
 12: (()+0x7df5) [0x7fb178e8fdf5]
 13: (clone()+0x6d) [0x7fb1779721ad]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 0 lockdep
   0/ 0 context
   0/ 0 crush
   0/ 0 mds
   0/ 0 mds_balancer
   0/ 0 mds_locker
   0/ 0 mds_log
   0/ 0 mds_log_expire
   0/ 0 mds_migrator
   0/ 0 buffer
   0/ 0 timer
   0/ 0 filer
   0/ 1 striper
   0/ 0 objecter
   0/ 0 rados
   0/ 0 rbd
   0/ 5 rbd_replay
   0/ 0 journaler
   0/ 0 objectcacher
   0/ 0 client
   0/ 0 osd
   0/ 0 optracker
   0/ 0 objclass
   0/ 0 filestore
   1/ 3 keyvaluestore
   0/ 0 journal
   0/ 0 ms
   0/ 0 mon
   0/ 0 monc
   0/ 0 paxos
   0/ 0 tp
   0/ 0 auth
   1/ 5 crypto
   0/ 0 finisher
   0/ 0 heartbeatmap
   0/ 0 perfcounter
   0/ 0 rgw
   1/10 civetweb
   1/ 5 javaclient
   0/ 0 asok
   0/ 0 throttle
   0/ 0 refs
   1/ 5 xio
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.175.log
--- end dump of recent events ---
2016-03-07 12:18:58.917246 7f0136683880  0 ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43), process ceph-osd, pid 14772
2016-03-07 12:18:58.953204 7f0136683880  0 filestore(/var/lib/ceph/osd/ceph-175) backend generic (magic 0xef53)
2016-03-07 12:18:58.954279 7f0136683880  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-175) detect_features: FIEMAP ioctl is supported and appears to work
2016-03-07 12:18:58.954289 7f0136683880  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-175) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2016-03-07 12:18:58.962463 7f0136683880  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-175) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2016-03-07 12:18:58.981991 7f0136683880  0 filestore(/var/lib/ceph/osd/ceph-175) limited size xattrs
2016-03-07 12:18:59.399525 7f0136683880  0 filestore(/var/lib/ceph/osd/ceph-175) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2016-03-07 12:18:59.769470 7f0136683880  0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello
2016-03-07 12:18:59.770883 7f0136683880  0 osd.175 117801 crush map has features 281510444007424, adjusting msgr requires for clients
2016-03-07 12:18:59.770896 7f0136683880  0 osd.175 117801 crush map has features 281510444007424 was 8705, adjusting msgr requires for mons
2016-03-07 12:18:59.770902 7f0136683880  0 osd.175 117801 crush map has features 281510444007424, adjusting msgr requires for osds
2016-03-07 12:18:59.770916 7f0136683880  0 osd.175 117801 load_pgs
2016-03-07 12:19:03.412674 7f0136683880  0 osd.175 117801 load_pgs opened 143 pgs
2016-03-07 12:19:03.413296 7f0136683880 -1 osd.175 117801 log_to_monitors {default=true}
2016-03-07 12:19:03.417774 7f01229ad700  0 osd.175 117801 ignoring osdmap until we have initialized
2016-03-07 12:19:03.417979 7f01229ad700  0 osd.175 117801 ignoring osdmap until we have initialized
2016-03-07 12:19:03.433830 7f0136683880  0 osd.175 117801 done with init, starting boot process