On 27/09/2015 10:25, Lionel Bouton wrote:
> On 27/09/2015 09:15, Lionel Bouton wrote:
>> Hi,
>>
>> we just had a quasi-simultaneous crash on two different OSDs which
>> blocked our VMs (min_size = 2, size = 3) on Firefly 0.80.9.
>>
>> The first OSD to go down had this error:
>>
>> 2015-09-27 06:30:33.257133 7f7ac7fef700 -1 os/FileStore.cc: In function
>> 'virtual int FileStore::read(coll_t, const ghobject_t&, uint64_t,
>> size_t, ceph::bufferlist&, bool)' thread 7f7ac7fef700 time 2015-09-27
>> 06:30:33.145251
>> os/FileStore.cc: 2641: FAILED assert(allow_eio || !m_filestore_fail_eio
>> || got != -5)
>>
>> The second OSD crash was similar:
>>
>> 2015-09-27 06:30:57.373841 7f05d92cf700 -1 os/FileStore.cc: In function
>> 'virtual int FileStore::read(coll_t, const ghobject_t&, uint64_t,
>> size_t, ceph::bufferlist&, bool)' thread 7f05d92cf700 time 2015-09-27
>> 06:30:57.260978
>> os/FileStore.cc: 2641: FAILED assert(allow_eio || !m_filestore_fail_eio
>> || got != -5)
>>
>> I'm familiar with this error: it already happened with a BTRFS read
>> error (invalid csum) and I could correct it after flush-journal/deleting
>> the corrupted file/starting OSD/pg repair.
>> This time though there isn't any kernel log indicating an invalid csum.
>> The kernel is different though: we use 3.18.9 on these two servers and
>> the others had 4.0.5, so maybe BTRFS doesn't log invalid checksum errors
>> with this version. I've launched btrfs scrub on the 2 filesystems just
>> in case (still waiting for completion).
>>
>> The first attempt to restart these OSDs failed: one OSD died 19 seconds
>> after start, the other 21 seconds. Seeing that, I temporarily lowered
>> min_size to 1, which allowed the 9 incomplete PGs to recover. I
>> verified this by setting min_size back to 2, and then restarted the 2
>> OSDs. They haven't crashed again so far.
>>
>> For reference, the assert failures were still the same when the OSDs died
>> shortly after start:
>> 2015-09-27 08:20:19.332835 7f4467bd0700 -1 os/FileStore.cc: In function
>> 'virtual int FileStore::read(coll_t, const ghobject_t&, uint64_t,
>> size_t, ceph::bufferlist&, bool)' thread 7f4467bd0700 time 2015-09-27
>> 08:20:19.325126
>> os/FileStore.cc: 2641: FAILED assert(allow_eio || !m_filestore_fail_eio
>> || got != -5)
>>
>> 2015-09-27 08:20:50.626344 7f97f2d95700 -1 os/FileStore.cc: In function
>> 'virtual int FileStore::read(coll_t, const ghobject_t&, uint64_t,
>> size_t, ceph::bufferlist&, bool)' thread 7f97f2d95700 time 2015-09-27
>> 08:20:50.605234
>> os/FileStore.cc: 2641: FAILED assert(allow_eio || !m_filestore_fail_eio
>> || got != -5)
>>
>> Note that at 2015-09-27 06:30:11 a deep-scrub started on a PG involving
>> one (and only one) of these 2 OSDs. As we evenly space deep-scrubs
>> (currently at a 10-minute interval), this might be relevant (or just a
>> coincidence).
>>
>> I made copies of the ceph osd logs (including the stack trace and the
>> recent events) if needed.
>>
>> Can anyone shed some light on why these OSDs died?
> I just had a thought. Could launching a defragmentation on a file in a
> BTRFS OSD filestore trigger this problem?

That's not it: we had another crash a couple of hours ago on one of the
two servers involved in the first crashes, and there was no concurrent
defragmentation going on.

2015-09-29 14:18:53.479881 7f8d78ff9700 -1 os/FileStore.cc: In function 'virtual int FileStore::read(coll_t, const ghobject_t&, uint64_t, size_t, ceph::bufferlist&, bool)' thread 7f8d78ff9700 time 2015-09-29 14:18:53.425790
os/FileStore.cc: 2641: FAILED assert(allow_eio || !m_filestore_fail_eio || got != -5)

 ceph version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047)
 1: (FileStore::read(coll_t, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::list&, bool)+0x96a) [0x8917ea]
 2: (ReplicatedBackend::objects_read_sync(hobject_t const&, unsigned long, unsigned long, ceph::buffer::list*)+0x81) [0x90ecc1]
 3: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x6a81) [0x801091]
 4: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x63) [0x809f23]
 5: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xb6f) [0x80adbf]
 6: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x2ced) [0x815f4d]
 7: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x70c) [0x7b047c]
 8: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x34a) [0x60c74a]
 9: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x628) [0x628808]
 10: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x66ea8c]
 11: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0xa60416]
 12: (ThreadPool::WorkThread::entry()+0x10) [0xa62430]
 13: (()+0x8217) [0x7f8dae984217]
 14: (clone()+0x6d) [0x7f8dad129f8d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

For the 2 previous crashes, I launched btrfs scrubs and they didn't find
any problem.

Could someone help diagnose what is going on? Is it a known bug?
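
For anyone decoding the assert: -5 is -EIO on Linux, so as far as I can
tell the condition only fails when all three terms are false at once. The
sketch below is plain Python, not Ceph code. It copies the boolean
expression from the log line, the function name is made up for
illustration, and treating `got` as the return value of the underlying
read is my assumption from the stack trace.

```python
# Illustrative only: mirrors the logged expression
#   assert(allow_eio || !m_filestore_fail_eio || got != -5)
# The function name is hypothetical; it is not part of Ceph.

EIO = 5  # errno value for "Input/output error" on Linux


def read_assert_holds(allow_eio: bool, fail_eio: bool, got: int) -> bool:
    """Return True when the logged assert condition would hold."""
    return allow_eio or (not fail_eio) or got != -EIO


if __name__ == "__main__":
    # Enumerate the flag combinations for a read that returned -EIO.
    for allow_eio in (False, True):
        for fail_eio in (False, True):
            ok = read_assert_holds(allow_eio, fail_eio, -EIO)
            print(f"allow_eio={allow_eio} fail_eio={fail_eio} -> "
                  f"{'ok' if ok else 'ASSERT FAILS'}")
```

Under that reading, the only failing combination is a read that returned
-EIO while allow_eio was false and m_filestore_fail_eio was true, which
would mean the OSD got an I/O error from the filesystem even though
neither the kernel log nor btrfs scrub reported one.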

Best regards,

Lionel