Dear all,

Recently I have seen several OSD crashes in our production cluster. The core dump looks like this:

Loaded symbols for /usr/lib64/ceph/erasure-code/libec_jerasure.so.2.0.0
Reading symbols from /usr/lib64/libboost_system-mt.so.5...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libboost_system-mt.so.5
Core was generated by `ceph-osd -i 359'.
Program terminated with signal 11, Segmentation fault.
#0  ceph::log::Log::is_inside_log_lock (this=0x0) at log/Log.cc:361
361         pthread_self() == m_flush_mutex_holder;
Missing separate debuginfos, use: debuginfo-install boost-system-1.41.0-18.el6.x86_64 boost-thread-1.41.0-18.el6.x86_64 bzip2-libs-1.0.5-7.el6_0.x86_64 glibc-2.12-1.166.el6_7.7.x86_64 gperftools-libs-2.0-11.el6.3.x86_64 leveldb-1.7.0-2.el6.x86_64 libaio-0.3.107-10.el6.x86_64 libgcc-4.4.7-11.el6.x86_64 libstdc++-4.4.7-11.el6.x86_64 libunwind-1.1-3.el6.x86_64 libuuid-2.17.2-12.14.el6.x86_64 lttng-ust-2.4.1-1.el6.x86_64 nspr-4.10.0-1.el6.x86_64 nss-3.15.1-15.el6.x86_64 nss-softokn-3.14.3-9.el6.x86_64 nss-softokn-freebl-3.14.3-9.el6.x86_64 nss-util-3.15.1-3.el6.x86_64 snappy-1.1.0-1.el6.x86_64 sqlite-3.6.20-1.el6.x86_64 userspace-rcu-0.7.7-1.el6.x86_64 zlib-1.2.3-29.el6.x86_64

(gdb) l
356
357     bool Log::is_inside_log_lock()
358     {
359       return
360         pthread_self() == m_queue_mutex_holder ||
361         pthread_self() == m_flush_mutex_holder;
362     }
363
364     void Log::inject_segv()
365     {

(gdb) bt
#0  ceph::log::Log::is_inside_log_lock (this=0x0) at log/Log.cc:361
#1  0x0000000000bf0884 in handle_fatal_signal (signum=11) at global/signal_handler.cc:89
#2  <signal handler called>
#3  0x0000003e1640ca71 in pthread_cancel () from /lib64/libpthread.so.0
#4  0x00007f11be3469b1 in lttng_ust_exit () from /usr/lib64/liblttng-ust.so.0
#5  0x00007f11be33ed9f in ?? () from /usr/lib64/liblttng-ust.so.0
#6  0x0000000000000022 in ?? () at /opt/centos/devtoolset-1.1/root/usr/include/c++/4.7.2/ext/atomicity.h:48
#7  0x0000000000000000 in ?? ()

My system is CentOS 6.5 (x86_64).
The Ceph version is Hammer 0.94.3.

Has anyone met the same problem? Any advice is appreciated.

Best Regards,
Brandy

--
Software Engineer, ChinaNetCenter Co., ShenZhen, Guangdong Province, China
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde