For the record, in the linked issue it was thought that this might be due
to write caching. This seems not to be the case, as it happened to me
again with write caching disabled.
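(For anyone who wants to rule this out on their own hardware: on SATA
drives the volatile write cache can be checked and switched off with
hdparm, roughly as below. /dev/sdX is a placeholder for the OSD's backing
device, and on some drives the setting does not survive a power cycle.)

  # show whether the drive's volatile write cache is enabled
  hdparm -W /dev/sdX

  # disable the write cache (re-apply after power cycles if the drive forgets it)
  hdparm -W 0 /dev/sdX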
On Tue, Jan 8, 2019 at 11:15 AM Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> I've seen this on luminous, but not on mimic. Can you generate a log with
> debug osd = 20 leading up to the crash?
>
> Thanks!
> sage
>
> On Tue, 8 Jan 2019, Paul Emmerich wrote:
> >
> > I've seen this before a few times, but unfortunately there doesn't seem
> > to be a good solution at the moment :(
> >
> > See also: http://tracker.ceph.com/issues/23145
> >
> > Paul
> >
> > --
> > Paul Emmerich
> >
> > Looking for help with your Ceph cluster? Contact us at https://croit.io
> >
> > croit GmbH
> > Freseniusstr. 31h
> > 81247 München
> > www.croit.io
> > Tel: +49 89 1896585 90
> >
> > On Tue, Jan 8, 2019 at 9:37 AM David Young <funkypenguin@xxxxxxxxxxxxxx> wrote:
> > >
> > > Hi all,
> > >
> > > One of my OSD hosts recently ran into RAM contention (it was swapping
> > > heavily), and after rebooting I'm seeing this error on random OSDs in
> > > the cluster:
> > >
> > > ---
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 1: /usr/bin/ceph-osd() [0xcac700]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 2: (()+0x11390) [0x7f8fa5d0e390]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 3: (gsignal()+0x38) [0x7f8fa5241428]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 4: (abort()+0x16a) [0x7f8fa524302a]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x250) [0x7f8fa767c510]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 6: (()+0x2e5587) [0x7f8fa767c587]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x923) [0xbab5e3]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 8: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x5c3) [0xbade03]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 9: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x82) [0x79c812]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 10: (OSD::dispatch_context_transaction(PG::RecoveryCtx&, PG*, ThreadPool::TPHandle*)+0x58) [0x730ff8]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 11: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0xfe) [0x759aae]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 12: (PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x50) [0x9c5720]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 13: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x590) [0x769760]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 14: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x476) [0x7f8fa76824f6]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 15: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f8fa76836b0]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 16: (()+0x76ba) [0x7f8fa5d046ba]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: 17: (clone()+0x6d) [0x7f8fa531341d]
> > > Jan 08 03:34:36 prod1 ceph-osd[3357939]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> > > Jan 08 03:34:36 prod1 systemd[1]: ceph-osd@43.service: Main process exited, code=killed, status=6/ABRT
> > > ---
> > >
> > > I've restarted all the OSDs and the mons, but I'm still encountering
> > > the above.
> > >
> > > Any ideas / suggestions?
> > >
> > > Thanks!
> > > D
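For anyone else hitting this who wants to capture the log Sage asked for
above: one way is to raise the OSD debug level and wait for the next
crash. The commands below are a sketch; osd.43 matches the failing daemon
in the journal output, so substitute your own OSD id.

  # bump logging on the running daemon via its admin socket (run on the OSD host)
  ceph daemon osd.43 config set debug_osd 20

  # or inject it from any node with admin access
  ceph tell osd.43 injectargs '--debug_osd 20'

  # or make it stick across restarts in ceph.conf:
  #   [osd]
  #   debug osd = 20

Remember to lower it again afterwards (the default is 1/5); level 20
logging is extremely verbose.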
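As for the NOTE at the end of the trace: with the ceph-osd binary (and
matching debug symbols) from the same 13.2.4 build, the bracketed frame
addresses can usually be resolved to source lines, along these lines:

  # disassemble with interleaved source, as the message itself suggests
  objdump -rdS /usr/bin/ceph-osd > ceph-osd.asm

  # or translate an individual frame address, e.g. frame 7 above
  addr2line -Cf -e /usr/bin/ceph-osd 0xbab5e3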