We'll get https://github.com/ceph/ceph/pull/32000 out in 13.2.8 as quickly as possible.

Neha

On Wed, Dec 4, 2019 at 6:56 AM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> My advice is to wait.
>
> We built 13.2.7 with https://github.com/ceph/ceph/pull/26448
> cherry-picked, and the OSDs no longer crash.
>
> My vote would be for a quick 13.2.8.
>
> -- Dan
>
> On Wed, Dec 4, 2019 at 2:41 PM Frank Schilder <frans@xxxxxx> wrote:
> >
> > Is this issue now a no-go for updating to 13.2.7, or are there only some specific unsafe scenarios?
> >
> > Best regards,
> >
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Dan van der Ster <dan@xxxxxxxxxxxxxx>
> > Sent: 03 December 2019 16:42:45
> > To: ceph-users
> > Subject: Re: v13.2.7 osds crash in build_incremental_map_msg
> >
> > I created https://tracker.ceph.com/issues/43106 and we're downgrading
> > our OSDs back to 13.2.6.
> >
> > -- dan
> >
> > On Tue, Dec 3, 2019 at 4:09 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> > >
> > > Hi all,
> > >
> > > We're midway through an update from 13.2.6 to 13.2.7 and have started
> > > getting OSDs crashing regularly like this [1].
> > > Does anyone know offhand what the issue is? (Maybe
> > > https://github.com/ceph/ceph/pull/26448/files ?)
> > > Or is it a temporary problem while we still have v13.2.6 and
> > > v13.2.7 OSDs running concurrently?
> > >
> > > Thanks!
> > >
> > > Dan
> > >
> > > [1]
> > >
> > > 2019-12-03 15:53:51.817 7ff3a3d39700 -1 osd.1384 2758889
> > > build_incremental_map_msg missing incremental map 2758889
> > > 2019-12-03 15:53:51.817 7ff3a453a700 -1 osd.1384 2758889
> > > build_incremental_map_msg missing incremental map 2758889
> > > 2019-12-03 15:53:51.817 7ff3a453a700 -1 osd.1384 2758889
> > > build_incremental_map_msg unable to load latest map 2758889
> > > 2019-12-03 15:53:51.822 7ff3a453a700 -1 *** Caught signal (Aborted) **
> > > in thread 7ff3a453a700 thread_name:tp_osd_tp
> > >
> > > ceph version 13.2.7 (71bd687b6e8b9424dd5e5974ed542595d8977416) mimic (stable)
> > > 1: (()+0xf5f0) [0x7ff3c620b5f0]
> > > 2: (gsignal()+0x37) [0x7ff3c522b337]
> > > 3: (abort()+0x148) [0x7ff3c522ca28]
> > > 4: (OSDService::build_incremental_map_msg(unsigned int, unsigned int,
> > > OSDSuperblock&)+0x767) [0x555d60e8d797]
> > > 5: (OSDService::send_incremental_map(unsigned int, Connection*,
> > > std::shared_ptr<OSDMap const>&)+0x39e) [0x555d60e8dbee]
> > > 6: (OSDService::share_map_peer(int, Connection*,
> > > std::shared_ptr<OSDMap const>)+0x159) [0x555d60e8eda9]
> > > 7: (OSDService::send_message_osd_cluster(int, Message*, unsigned
> > > int)+0x1a5) [0x555d60e8f085]
> > > 8: (ReplicatedBackend::issue_op(hobject_t const&, eversion_t const&,
> > > unsigned long, osd_reqid_t, eversion_t, eversion_t, hobject_t,
> > > hobject_t, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t>
> > > > const&, boost::optional<pg_hit_set_history_t>&,
> > > ReplicatedBackend::InProgressOp*, ObjectStore::Transaction&)+0x452)
> > > [0x555d6116e522]
> > > 9: (ReplicatedBackend::submit_transaction(hobject_t const&,
> > > object_stat_sum_t const&, eversion_t const&,
> > > std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&,
> > > eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t,
> > > std::allocator<pg_log_entry_t> > const&,
> > > boost::optional<pg_hit_set_history_t>&, Context*, unsigned long,
> > > osd_reqid_t,
> > > boost::intrusive_ptr<OpRequest>)+0x6f5) [0x555d6117ed85]
> > > 10: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*,
> > > PrimaryLogPG::OpContext*)+0xd62) [0x555d60ff5142]
> > > 11: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0xf12)
> > > [0x555d61035902]
> > > 12: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x3679)
> > > [0x555d610397a9]
> > > 13: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
> > > ThreadPool::TPHandle&)+0xc99) [0x555d6103d869]
> > > 14: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
> > > boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1b7)
> > > [0x555d60e8e8a7]
> > > 15: (PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&,
> > > ThreadPool::TPHandle&)+0x62) [0x555d611144c2]
> > > 16: (OSD::ShardedOpWQ::_process(unsigned int,
> > > ceph::heartbeat_handle_d*)+0x592) [0x555d60eb25f2]
> > > 17: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x3d3)
> > > [0x7ff3c929f5b3]
> > > 18: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7ff3c92a01a0]
> > > 19: (()+0x7e65) [0x7ff3c6203e65]
> > > 20: (clone()+0x6d) [0x7ff3c52f388d]
> > > NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> > > needed to interpret this.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
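
For readers trying to interpret the backtrace: the three log lines just before the abort appear to describe a two-step failure inside OSDService::build_incremental_map_msg. The OSD could not find the incremental map for epoch 2758889 in its local store, then also failed to load a map to fall back to, and aborted. The following is a minimal, self-contained C++ sketch of that failure sequence only, using a hypothetical MapStore container; it is not the actual Ceph implementation and does not reflect the contents of PR 26448 or PR 32000.

// NOT actual Ceph code: a simplified, hypothetical model of the failure
// sequence visible in the log above.  An OSD builds a message of map
// updates for a peer; a missing incremental map triggers a fallback to a
// full map, and a missing full map is treated as fatal.
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <map>
#include <optional>
#include <string>
#include <vector>

using epoch_t = uint32_t;

// Hypothetical stand-in for the OSD's local store of encoded maps.
struct MapStore {
  std::map<epoch_t, std::string> incrementals;  // epoch -> incremental map blob
  std::map<epoch_t, std::string> full_maps;     // epoch -> full map blob

  std::optional<std::string> get_inc(epoch_t e) const {
    auto it = incrementals.find(e);
    if (it == incrementals.end()) return std::nullopt;
    return it->second;
  }
  std::optional<std::string> get_full(epoch_t e) const {
    auto it = full_maps.find(e);
    if (it == full_maps.end()) return std::nullopt;
    return it->second;
  }
};

// Collect the maps needed to bring a peer from epoch `since` up to `to`.
std::vector<std::string> build_incremental_map_msg(const MapStore& store,
                                                   epoch_t since, epoch_t to) {
  std::vector<std::string> out;
  for (epoch_t e = since + 1; e <= to; ++e) {
    if (auto inc = store.get_inc(e)) {     // preferred: small incremental update
      out.push_back(*inc);
      continue;
    }
    std::fprintf(stderr, "build_incremental_map_msg missing incremental map %u\n",
                 static_cast<unsigned>(e));
    if (auto full = store.get_full(e)) {   // fallback: send the full map instead
      out.push_back(*full);
      continue;
    }
    std::fprintf(stderr, "build_incremental_map_msg unable to load latest map %u\n",
                 static_cast<unsigned>(e));
    std::abort();                          // -> "*** Caught signal (Aborted) **"
  }
  return out;
}

int main() {
  MapStore store;
  store.incrementals[2758888] = "inc-2758888";
  // Epoch 2758889 is present neither as an incremental nor as a full map,
  // which reproduces the logged sequence and aborts the process.
  build_incremental_map_msg(store, 2758887, 2758889);
  return 0;
}

Built with a C++17 compiler, this sketch prints the same "missing incremental map" and "unable to load latest map" messages and then aborts, mirroring the pattern in the OSD log above.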