Hi,

My setup:
- 5 physical nodes
- 1 MON
- 4 OSD
- OS: CentOS 7
- Ceph: 0.94.1
- rbd_cache = false

Lately I was benchmarking Ceph 0.94.1 rbd devices created in Memstore with fio + librbd, and I encountered an interesting crash:

 1: /usr/bin/ceph-osd() [0xb81872]
 2: (()+0xf130) [0x7f02fba00130]
 3: (gsignal()+0x39) [0x7f02fa41a989]
 4: (abort()+0x148) [0x7f02fa41c098]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f02fad1e9d5]
 6: (()+0x5e946) [0x7f02fad1c946]
 7: (()+0x5e973) [0x7f02fad1c973]
 8: (()+0x5eb9f) [0x7f02fad1cb9f]
 9: (ceph::buffer::list::iterator::copy(unsigned int, char*)+0x137) [0xcf6057]
 10: (void decode<unsigned long, unsigned long>(std::map<unsigned long, unsigned long, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, unsigned long> > >&, ceph::buffer::list::iterator&)+0x3e) [0x96c65e]
 11: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x7018) [0x930588]
 12: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0xbf) [0x93e59f]
 13: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x5f1) [0x93ee11]
 14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>&)+0x45d7) [0x944bc7]
 15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x68a) [0x8dd2fa]
 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x409) [0x6cf3d9]
 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x32f) [0x6cf9ef]
 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x86f) [0xc70f6f]
 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xc730a0]
 20: (()+0x7df3) [0x7f02fb9f8df3]
 21: (clone()+0x6d) [0x7f02fa4db3dd]

It was related to the implementation of the Memstore::fiemap function, called in ReplicatedPG::do_osd_ops for the sparse read command: when the offset was bigger than the object size, Memstore::fiemap was just returning '0', ReplicatedPG::do_osd_ops was interpreting it as "ok" and
starting to decode the bufferlist of extents, which was empty, and this was the main cause of the crash.

It looks like Filestore isn't affected by this issue: the fiemap syscall returns an error, or, when fiemap is not available, it just encodes [offset,length] (without *any* validation of input parameters).

Now I'm wondering about the best way to fix it: mimic the behavior of fiemap (the long way)? Or implement it similarly to Filestore with a backend that doesn't support fiemap (the fast way)? Another question is whether the OSD should try to read past an object's boundary at all.

And my other question: I'm encountering this issue *every* time on the physical cluster, and I'm not able to reproduce it even *once* on a vstart cluster with the same configuration. Is there any way to force sparse reads with librbd?

-Lukas
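
To make the failure mode described above concrete, here is a minimal, self-contained model of it in Python. This is not Ceph code. The function names (fiemap_buggy, fiemap_fixed, encode_extents, decode_extents) are hypothetical. The wire format (a 32-bit count followed by pairs of 64-bit integers) is only loosely modelled on how a map of extents might be encoded. Treating an offset at or past the end of the object as "past the end" is an assumption of the sketch. The "fixed" variant shows just one possible option (always encode a map, possibly empty, when returning success), not a recommendation for what the fix should be.

```python
import struct


def encode_extents(extents):
    """Encode {offset: length} as a u32 count plus (u64, u64) pairs."""
    out = struct.pack("<I", len(extents))
    for off, length in sorted(extents.items()):
        out += struct.pack("<QQ", off, length)
    return out


def decode_extents(buf):
    """Decode the format above; raises struct.error on a short or empty buffer."""
    (count,) = struct.unpack_from("<I", buf, 0)
    pos = 4
    extents = {}
    for _ in range(count):
        off, length = struct.unpack_from("<QQ", buf, pos)
        extents[off] = length
        pos += 16
    return extents


def fiemap_buggy(obj_size, offset, length):
    """Model of the problem: success code, but nothing encoded past the end."""
    if offset >= obj_size:  # assumption: at or past the end of the object
        return 0, b""
    length = min(length, obj_size - offset)
    return 0, encode_extents({offset: length})


def fiemap_fixed(obj_size, offset, length):
    """One possible alternative: success always carries a decodable map."""
    if offset >= obj_size:
        return 0, encode_extents({})  # empty map, still decodable
    length = min(length, obj_size - offset)
    return 0, encode_extents({offset: length})


if __name__ == "__main__":
    rc, buf = fiemap_buggy(obj_size=4096, offset=8192, length=512)
    try:
        decode_extents(buf)  # caller trusts rc == 0 and decodes anyway
    except struct.error as exc:
        print("decode failed on empty buffer:", exc)

    rc, buf = fiemap_fixed(obj_size=4096, offset=8192, length=512)
    print("fixed variant decodes to:", decode_extents(buf))
```

In this model the caller sees a return code of 0 in both cases; only the fixed variant hands back something that can be decoded. The other option mentioned above, returning an error for out-of-range input, would instead change the return code and leave the caller's error path to deal with it.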