Hi,

since last Thursday we have had an SSD pool (cache tier) in front of an EC pool, and we are filling the pools with data via rsync (approx. 50 MB/s).

The SSD pool has three disks, and one of them (a DC S3700) has failed four times since then. Each time I simply started the OSD again; the pool was rebuilt and worked again for some hours up to some days.

I switched the ceph node and the ssh-adapter, but this didn't solve the issue. There weren't any messages in syslog/messages and an fsck runs without trouble, so I guess the problem is not OS-related.

I found this issue http://tracker.ceph.com/issues/8747, but my ceph version is newer (Debian: ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)), and it looks like I can reproduce this issue within 1-3 days.

The OSD is ext4-formatted. All other OSDs (62) run without trouble.

# more ceph-osd.61.log
2015-01-13 16:29:26.494458 7fedf9a3d700 0 log [INF] : 17.0 scrub ok
2015-01-13 17:29:03.988530 7fedf823a700 0 log [INF] : 17.16 scrub ok
2015-01-13 17:30:31.901032 7fedf8a3b700 0 log [INF] : 17.18 scrub ok
2015-01-13 17:31:58.983736 7fedf823a700 0 log [INF] : 17.9 scrub ok
2015-01-13 17:32:30.780308 7fedf9a3d700 0 log [INF] : 17.c scrub ok
2015-01-13 17:32:33.311433 7fedf8a3b700 0 log [INF] : 17.11 scrub ok
2015-01-13 17:37:22.237214 7fedf9a3d700 0 log [INF] : 17.7 scrub ok
2015-01-13 20:15:07.874376 7fedf6236700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::finish_ctx(ReplicatedPG::OpContext*, int, bool)' thread 7fedf6236700 time 2015-01-13 20:15:07.853440
osd/ReplicatedPG.cc: 5306: FAILED assert(soid < scrubber.start || soid >= scrubber.end)

 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
 1: (ReplicatedPG::finish_ctx(ReplicatedPG::OpContext*, int, bool)+0x1320) [0x9296b0]
 2: (ReplicatedPG::try_flush_mark_clean(boost::shared_ptr<ReplicatedPG::FlushOp>)+0x5f6) [0x92b076]
 3: (ReplicatedPG::finish_flush(hobject_t, unsigned long, int)+0x296) [0x92b876]
 4: (C_Flush::finish(int)+0x86) [0x986226]
 5: (Context::complete(int)+0x9) [0x78f449]
 6: (Finisher::finisher_thread_entry()+0x1c8) [0xad5a18]
 7: (()+0x6b50) [0x7fee152f6b50]
 8: (clone()+0x6d) [0x7fee13f047bd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
 -70> 2015-01-11 19:54:47.962164 7fee15dd4780 5 asok(0x2f56230) register_command perfcounters_dump hook 0x2f44010
 -69> 2015-01-11 19:54:47.962190 7fee15dd4780 5 asok(0x2f56230) register_command 1 hook 0x2f44010
 -68> 2015-01-11 19:54:47.962195 7fee15dd4780 5 asok(0x2f56230) register_command perf dump hook 0x2f44010
 -67> 2015-01-11 19:54:47.962201 7fee15dd4780 5 asok(0x2f56230) register_command perfcounters_schema hook 0x2f44010
 -66> 2015-01-11 19:54:47.962203 7fee15dd4780 5 asok(0x2f56230) register_command 2 hook 0x2f44010
 -65> 2015-01-11 19:54:47.962207 7fee15dd4780 5 asok(0x2f56230) register_command perf schema hook 0x2f44010
 -64> 2015-01-11 19:54:47.962209 7fee15dd4780 5 asok(0x2f56230) register_command config show hook 0x2f44010
 -63> 2015-01-11 19:54:47.962214 7fee15dd4780 5 asok(0x2f56230) register_command config set hook 0x2f44010
 -62> 2015-01-11 19:54:47.962219 7fee15dd4780 5 asok(0x2f56230) register_command config get hook 0x2f44010
 -61> 2015-01-11 19:54:47.962223 7fee15dd4780 5 asok(0x2f56230) register_command log flush hook 0x2f44010
 -60> 2015-01-11 19:54:47.962226 7fee15dd4780 5 asok(0x2f56230) register_command log dump hook 0x2f44010
 -59> 2015-01-11 19:54:47.962229 7fee15dd4780 5 asok(0x2f56230) register_command log reopen hook 0x2f44010
 -58> 2015-01-11 19:54:47.965000 7fee15dd4780 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process ceph-osd, pid 11735
 -57> 2015-01-11 19:54:47.967362 7fee15dd4780 1 finished global_init_daemonize
 -56> 2015-01-11 19:54:47.971666 7fee15dd4780 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-61) detect_features: FIEMAP ioctl is supported and appears to work
 -55> 2015-01-11 19:54:47.971682 7fee15dd4780 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-61) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
 -54> 2015-01-11 19:54:47.973281 7fee15dd4780 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-61) detect_features: syscall(SYS_syncfs, fd) fully supported
 -53> 2015-01-11 19:54:47.975393 7fee15dd4780 0 filestore(/var/lib/ceph/osd/ceph-61) limited size xattrs
 -52> 2015-01-11 19:54:48.013905 7fee15dd4780 0 filestore(/var/lib/ceph/osd/ceph-61) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
 -51> 2015-01-11 19:54:49.245360 7fee15dd4780 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-61) detect_features: FIEMAP ioctl is supported and appears to work
 -50> 2015-01-11 19:54:49.245370 7fee15dd4780 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-61) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
 -49> 2015-01-11 19:54:49.247017 7fee15dd4780 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-61) detect_features: syscall(SYS_syncfs, fd) fully supported
 -48> 2015-01-11 19:54:49.248912 7fee15dd4780 0 filestore(/var/lib/ceph/osd/ceph-61) limited size xattrs
 -47> 2015-01-11 19:54:49.251863 7fee15dd4780 0 filestore(/var/lib/ceph/osd/ceph-61) mount: WRITEAHEAD journal mode explicitly enabled in conf
 -46> 2015-01-11 19:54:49.362965 7fee15dd4780 0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello
 -45> 2015-01-11 19:54:49.387439 7fee15dd4780 0 osd.61 116417 crush map has features 2303210029056, adjusting msgr requires for clients
 -44> 2015-01-11 19:54:49.387458 7fee15dd4780 0 osd.61 116417 crush map has features 2578087936000 was 8705, adjusting msgr requires for mons
 -43> 2015-01-11 19:54:49.387470 7fee15dd4780 0 osd.61 116417 crush map has features 2578087936000, adjusting msgr requires for osds
 -42> 2015-01-11 19:54:49.387484 7fee15dd4780 0 osd.61 116417 load_pgs
 -41> 2015-01-11 19:54:50.206711 7fee15dd4780 0 osd.61 116417 load_pgs opened 32 pgs
 -40> 2015-01-11 19:54:50.211664 7fee02a4f700 0 osd.61 116417 ignoring osdmap until we have initialized
 -39> 2015-01-11 19:54:50.211752 7fee02a4f700 0 osd.61 116417 ignoring osdmap until we have initialized
 -38> 2015-01-11 19:54:50.212428 7fee15dd4780 0 osd.61 116417 done with init, starting boot process
....

Udo
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com