It's an EIO. The OSD got an EIO from the underlying fs. That's what
causes those asserts. You probably want to redirect this to the relevant
fs mailing list.
-Sam

On Tue, Sep 29, 2015 at 7:42 AM, Lionel Bouton
<lionel-subscription@xxxxxxxxxxx> wrote:
> On 27/09/2015 10:25, Lionel Bouton wrote:
>> On 27/09/2015 09:15, Lionel Bouton wrote:
>>> Hi,
>>>
>>> we just had a quasi-simultaneous crash of two different OSDs which
>>> blocked our VMs (min_size = 2, size = 3) on Firefly 0.80.9.
>>>
>>> The first OSD to go down had this error:
>>>
>>> 2015-09-27 06:30:33.257133 7f7ac7fef700 -1 os/FileStore.cc: In function
>>> 'virtual int FileStore::read(coll_t, const ghobject_t&, uint64_t,
>>> size_t, ceph::bufferlist&, bool)' thread 7f7ac7fef700 time 2015-09-27
>>> 06:30:33.145251
>>> os/FileStore.cc: 2641: FAILED assert(allow_eio || !m_filestore_fail_eio
>>> || got != -5)
>>>
>>> The second OSD crash was similar:
>>>
>>> 2015-09-27 06:30:57.373841 7f05d92cf700 -1 os/FileStore.cc: In function
>>> 'virtual int FileStore::read(coll_t, const ghobject_t&, uint64_t,
>>> size_t, ceph::bufferlist&, bool)' thread 7f05d92cf700 time 2015-09-27
>>> 06:30:57.260978
>>> os/FileStore.cc: 2641: FAILED assert(allow_eio || !m_filestore_fail_eio
>>> || got != -5)
>>>
>>> I'm familiar with this error: it has already happened after a BTRFS
>>> read error (invalid csum), and I could correct it with a flush-journal,
>>> deleting the corrupted file, restarting the OSD and running a pg repair.
>>> This time, though, there isn't any kernel log indicating an invalid
>>> csum. The kernel is different, though: we use 3.18.9 on these two
>>> servers while the others run 4.0.5, so maybe BTRFS doesn't log invalid
>>> checksum errors with this version. I've launched btrfs scrub on the two
>>> filesystems just in case (still waiting for completion).
>>>
>>> The first attempt to restart these OSDs failed: one OSD died 19 seconds
>>> after starting, the other after 21 seconds. Seeing that, I temporarily
>>> lowered min_size to 1, which allowed the 9 incomplete PGs to recover. I
>>> verified this, brought min_size back up to 2 and then restarted the 2
>>> OSDs. They haven't crashed yet.
>>>
>>> For reference, the assert failures were still the same when the OSDs
>>> died shortly after starting:
>>> 2015-09-27 08:20:19.332835 7f4467bd0700 -1 os/FileStore.cc: In function
>>> 'virtual int FileStore::read(coll_t, const ghobject_t&, uint64_t,
>>> size_t, ceph::bufferlist&, bool)' thread 7f4467bd0700 time 2015-09-27
>>> 08:20:19.325126
>>> os/FileStore.cc: 2641: FAILED assert(allow_eio || !m_filestore_fail_eio
>>> || got != -5)
>>>
>>> 2015-09-27 08:20:50.626344 7f97f2d95700 -1 os/FileStore.cc: In function
>>> 'virtual int FileStore::read(coll_t, const ghobject_t&, uint64_t,
>>> size_t, ceph::bufferlist&, bool)' thread 7f97f2d95700 time 2015-09-27
>>> 08:20:50.605234
>>> os/FileStore.cc: 2641: FAILED assert(allow_eio || !m_filestore_fail_eio
>>> || got != -5)
>>>
>>> Note that at 2015-09-27 06:30:11 a deep-scrub started on a PG involving
>>> one (and only one) of these 2 OSDs. As we evenly space deep-scrubs
>>> (currently with a 10 minute interval), this might be relevant (or just
>>> a coincidence).
>>>
>>> I made copies of the Ceph OSD logs (including the stack trace and the
>>> recent events) if needed.
>>>
>>> Can anyone shed some light on why these OSDs died?
>> I just had a thought. Could launching a defragmentation on a file in a
>> BTRFS OSD filestore trigger this problem?
>
> That's not it: we had another crash a couple of hours ago on one of the
> two servers involved in the first crashes, and there was no concurrent
> defragmentation going on.
>
> 2015-09-29 14:18:53.479881 7f8d78ff9700 -1 os/FileStore.cc: In function
> 'virtual int FileStore::read(coll_t, const ghobject_t&, uint64_t,
> size_t, ceph::bufferlist&, bool)' thread 7f8d78ff9700 time 2015-09-29
> 14:18:53.425790
> os/FileStore.cc: 2641: FAILED assert(allow_eio || !m_filestore_fail_eio
> || got != -5)
>
> ceph version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047)
> 1: (FileStore::read(coll_t, ghobject_t const&, unsigned long, unsigned
> long, ceph::buffer::list&, bool)+0x96a) [0x8917ea]
> 2: (ReplicatedBackend::objects_read_sync(hobject_t const&, unsigned
> long, unsigned long, ceph::buffer::list*)+0x81) [0x90ecc1]
> 3: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*,
> std::vector<OSDOp, std::allocator<OSDOp> >&)+0x6a81) [0x801091]
> 4: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x63)
> [0x809f23]
> 5: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xb6f) [0x80adbf]
> 6: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x2ced) [0x815f4d]
> 7: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
> ThreadPool::TPHandle&)+0x70c) [0x7b047c]
> 8: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
> std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x34a) [0x60c74a]
> 9: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>,
> ThreadPool::TPHandle&)+0x628) [0x628808]
> 10: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>,
> std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG>
> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x66ea8c]
> 11: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0xa60416]
> 12: (ThreadPool::WorkThread::entry()+0x10) [0xa62430]
> 13: (()+0x8217) [0x7f8dae984217]
> 14: (clone()+0x6d) [0x7f8dad129f8d]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> For the 2 previous crashes, I launched btrfs scrubs and they couldn't
> find any problem. Could someone help diagnose what is going on? Is it a
> known bug?
>
> Best regards,
>
> Lionel
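
For anyone not familiar with the assert quoted in the traces above: "got" is
the return value of the low-level read, which follows the negative-errno
convention, so -5 is -EIO. Below is a minimal, self-contained sketch -- not
the actual Ceph FileStore code, just an illustration that reuses the flag
names from the assert -- of how an I/O error reported by the filesystem ends
up tripping a check of this shape and aborting the daemon.

// Illustrative only: not the real FileStore::read, same assert shape though.
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

// Stand-ins for the configuration flags named in the assert.
static const bool m_filestore_fail_eio = true;  // "crash on EIO" behaviour
static const bool allow_eio = false;            // caller did not opt in to EIO

// Read 'len' bytes at 'off' from 'path'; returns bytes read, or -errno on
// error, which is the convention 'got' follows (-5 == -EIO).
static int read_object(const char* path, char* buf, size_t len, off_t off) {
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return -errno;
  ssize_t got = pread(fd, buf, len, off);
  int ret = (got < 0) ? -errno : static_cast<int>(got);
  close(fd);
  return ret;
}

int main(int argc, char** argv) {
  char buf[4096];
  const char* path = (argc > 1) ? argv[1] : "/tmp/testobj";
  int got = read_object(path, buf, sizeof(buf), 0);
  printf("read returned %d\n", got);

  // Same shape as the failed assert: it only passes when the caller allows
  // EIO, when the "fail on EIO" behaviour is disabled, or when the error is
  // anything other than -5 (EIO). A bad sector or csum error means got == -5,
  // so the assert fires and the daemon aborts, which is what the OSDs did.
  assert(allow_eio || !m_filestore_fail_eio || got != -5);
  return 0;
}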
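Since btrfs scrub reported no problems, one way to narrow things down would
be to re-read the suspect object file straight from the OSD's filestore
directory and see whether the kernel still returns EIO for some extent. The
following is only a hypothetical stand-alone helper (not a Ceph tool): it
scans a single file in 64 KiB chunks with pread() and reports any offset
where the read fails.

// Hypothetical diagnostic helper: scan one file and report unreadable chunks.
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char** argv) {
  if (argc != 2) {
    fprintf(stderr, "usage: %s <object-file>\n", argv[0]);
    return 1;
  }
  int fd = open(argv[1], O_RDONLY);
  if (fd < 0) {
    fprintf(stderr, "open: %s\n", strerror(errno));
    return 1;
  }
  char buf[1 << 16];            // 64 KiB chunks
  off_t off = 0;
  int bad = 0;
  for (;;) {
    ssize_t got = pread(fd, buf, sizeof(buf), off);
    if (got == 0)
      break;                    // end of file reached
    if (got < 0) {
      fprintf(stderr, "read error at offset %lld: %s\n",
              (long long)off, strerror(errno));
      bad++;
      off += sizeof(buf);       // skip past the unreadable chunk and continue
      continue;
    }
    off += got;
  }
  close(fd);
  printf("%d unreadable chunk(s)\n", bad);
  return bad ? 2 : 0;
}

Compiled with g++ and pointed at the object file the OSD was reading when it
crashed (under the OSD's current/ directory), this should reproduce the EIO
if the underlying device or filesystem is still returning it.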