Hi,
did you make any progress with this? I can't really help with the
stack trace. I'm just glad that we were able to decommission our
cache tier last week (although it served us very well for almost nine
years).
You write that those cache tier OSDs are used for both the cache tier
and data pools. Maybe you could separate the two by moving the cache
tier to different OSDs so there's no mixed use? I'm not very optimistic
that this would mitigate the issue, though: we had such a setup for
months during the transition to eventually remove the cache tier. In
the end we had to switch off all VMs in order to safely get rid of the
cache tier, because it wouldn't let us flush the remaining header
objects. But now we're finally in a state where we can plan the next
upgrade.
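For reference, the removal sequence for a writeback cache tier as I
remember it from the upstream cache tiering documentation looks roughly
like the sketch below. The pool names cachepool and datapool are
placeholders (not our real names), and the run helper is only an
illustration: it prints the commands unless DRY_RUN=0 is set. The
flush/evict step is the one that would not complete for us while VMs
were still running.

```shell
#!/bin/sh
# Placeholder pool names (assumptions); substitute the real ones.
CACHE_POOL="${CACHE_POOL:-cachepool}"
DATA_POOL="${DATA_POOL:-datapool}"
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# 1. Stop caching new writes; requests are proxied to the backing pool.
run ceph osd tier cache-mode "$CACHE_POOL" proxy

# 2. Flush and evict everything still held in the cache pool,
#    then list what is left (ideally nothing).
run rados -p "$CACHE_POOL" cache-flush-evict-all
run rados -p "$CACHE_POOL" ls

# 3. Detach the cache tier from the backing pool.
run ceph osd tier remove-overlay "$DATA_POOL"
run ceph osd tier remove "$DATA_POOL" "$CACHE_POOL"
```

Running it as-is only prints the five commands, which makes it easy to
review the order before touching the cluster.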
Regards,
Eugen
Quoting Nikola Ciprich <nikola.ciprich@xxxxxxxxxxx>:
Hello dear ceph users and developers,
today we hit an issue on one of our legacy clusters running 14.2.22.
We've apparently lost two objects containing cache tier hit_set history.
When hit_set_trim hits a missing object, it causes the OSD to crash:
ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351)
nautilus (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x141) [0x55714332faad]
2: (()+0x4e8ca8) [0x55714332fca8]
3:
(PrimaryLogPG::hit_set_trim(std::unique_ptr<PrimaryLogPG::OpContext,
std::default_delete<PrimaryLogPG::OpContext> >&, unsigned
int)+0xcd6) [0x557143626136]
4: (PrimaryLogPG::hit_set_remove_all()+0x2d5) [0x5571436264b5]
5: (PrimaryLogPG::on_activate()+0x55a) [0x5571436360aa]
6: (PG::RecoveryState::Active::react(PG::AllReplicasActivated
const&)+0x130) [0x557143512b50]
7: (boost::statechart::simple_state<PG::RecoveryState::Active,
PG::RecoveryState::Primary, PG::RecoveryState::Activating,
(boost::statechart::history_mode)0>::react_impl(boost::statechart::
8: (boost::statechart::simple_state<PG::RecoveryState::Activating,
PG::RecoveryState::Active, boost::mpl::list<mpl_::na, mpl_::na,
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na
9: (PG::do_peering_event(std::shared_ptr<PGPeeringEvent>,
PG::RecoveryCtx*)+0x15d) [0x55714353c7ed]
10: (OSD::dequeue_peering_evt(OSDShard*, PG*,
std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x211)
[0x5571434555d1]
11: (PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&,
ThreadPool::TPHandle&)+0x52) [0x557143723352]
12: (OSD::ShardedOpWQ::_process(unsigned int,
ceph::heartbeat_handle_d*)+0x582) [0x557143443122]
13: (ShardedThreadPool::shardedthreadpool_worker(unsigned
int)+0x3eb) [0x557143ac2f4b]
14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x557143ac5d30]
15: (()+0x7ea5) [0x7f22383e5ea5]
16: (clone()+0x6d) [0x7f22372a8b0d]
I've found an old report of a similar problem here:
https://www.spinics.net/lists/ceph-users/msg48660.html
However, using the same patch:
diff -Naur ceph-14.2.22/src/osd/PrimaryLogPG.cc ceph-14.2.22-fix-hit_set_trim/src/osd/PrimaryLogPG.cc
--- ceph-14.2.22/src/osd/PrimaryLogPG.cc 2024-12-29 12:34:17.000000000 +0100
+++ ceph-14.2.22-fix-hit_set_trim/src/osd/PrimaryLogPG.cc 2024-12-29 19:42:31.527632776 +0100
@@ -13932,11 +13932,13 @@
updated_hit_set_hist.history.pop_front();
ObjectContextRef obc = get_object_context(oid, false);
- ceph_assert(obc);
- --ctx->delta_stats.num_objects;
- --ctx->delta_stats.num_objects_hit_set_archive;
- ctx->delta_stats.num_bytes -= obc->obs.oi.size;
- ctx->delta_stats.num_bytes_hit_set_archive -= obc->obs.oi.size;
+ //ceph_assert(obc);
+ if(obc){
+ --ctx->delta_stats.num_objects;
+ --ctx->delta_stats.num_objects_hit_set_archive;
+ ctx->delta_stats.num_bytes -= obc->obs.oi.size;
+ ctx->delta_stats.num_bytes_hit_set_archive -= obc->obs.oi.size;
+ };
}
}
causes the OSD to crash later:
/usr/src/redhat/BUILD/ceph-14.2.22/src/osd/PrimaryLogPG.cc: In
function 'virtual void PrimaryLogPG::op_applied(const eversion_t&)'
thread 7f91e146d700 time 2024-12-29 19:58:13.880029
/usr/src/redhat/BUILD/ceph-14.2.22/src/osd/PrimaryLogPG.cc: 10457:
FAILED ceph_assert(applied_version <= info.last_update)
/usr/src/redhat/BUILD/ceph-14.2.22/src/osd/PrimaryLogPG.cc: In
function 'virtual void PrimaryLogPG::op_applied(const eversion_t&)'
thread 7f91df469700 time 2024-12-29 19:58:13.882629
/usr/src/redhat/BUILD/ceph-14.2.22/src/osd/PrimaryLogPG.cc: 10457:
FAILED ceph_assert(applied_version <= info.last_update)
ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351)
nautilus (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x122) [0x55e07e95b815]
2: (()+0x49899a) [0x55e07e95b99a]
3: (PrimaryLogPG::op_applied(eversion_t const&)+0x1f2) [0x55e07eb9fef2]
4: (ReplicatedBackend::submit_transaction(hobject_t const&,
object_stat_sum_t const&, eversion_t const&,
std::unique_ptr<PGTransaction, std::default_delete<PGTransaction>
>&&, eversion_t const&, eversion_t const&,
std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&,
boost::optional<pg_hit_set_history_t>&, Context*, unsigned long,
osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0x720) [0x55e07ed411b0]
5: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*,
PrimaryLogPG::OpContext*)+0xdc5) [0x55e07eba0cd5]
6:
(PrimaryLogPG::simple_opc_submit(std::unique_ptr<PrimaryLogPG::OpContext,
std::default_delete<PrimaryLogPG::OpContext> >)+0x9b) [0x55e07eba2acb]
7: (PrimaryLogPG::hit_set_remove_all()+0x2e5) [0x55e07ebdcbf5]
8: (PrimaryLogPG::on_pool_change()+0xeb) [0x55e07ebddc7b]
9: (PG::handle_advance_map(std::shared_ptr<OSDMap const>,
std::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int>
>&, int, std::vector<int, std::allocator<int> >&, int,
PG::RecoveryCtx*)+0x34f) [0x55e07eaf8c4f]
10: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&,
PG::RecoveryCtx*)+0x2f6) [0x55e07ea482e6]
11: (OSD::dequeue_peering_evt(OSDShard*, PG*,
std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x1b4)
[0x55e07ea50a24]
12: (PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&,
ThreadPool::TPHandle&)+0x4e) [0x55e07ecde60e]
13: (OSD::ShardedOpWQ::_process(unsigned int,
ceph::heartbeat_handle_d*)+0xfea) [0x55e07ea5527a]
14: (ShardedThreadPool::shardedthreadpool_worker(unsigned
int)+0x415) [0x55e07f032345]
15: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55e07f034740]
16: (()+0x7ea5) [0x7f9203024ea5]
17: (clone()+0x6d) [0x7f9201ee7b0d]
Unfortunately, I don't have any clue how to proceed from here.
I've increased hit_set_count to 32 and hit_set_period to 36000, thus
hopefully gaining some time (the OSDs now seem to be running).
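For anyone following along, both values are per-pool settings changed
with "ceph osd pool set". A minimal sketch, assuming the cache pool is
called cachepool (a placeholder name); the run helper is just an
illustration that prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Placeholder pool name (assumption); substitute the real cache pool.
CACHE_POOL="${CACHE_POOL:-cachepool}"
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Keep more hit sets, each covering a longer period (in seconds),
# using the values mentioned above.
run ceph osd pool set "$CACHE_POOL" hit_set_count 32
run ceph osd pool set "$CACHE_POOL" hit_set_period 36000

# Read the values back to confirm they were applied.
run ceph osd pool get "$CACHE_POOL" hit_set_count
run ceph osd pool get "$CACHE_POOL" hit_set_period
```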
Any ideas on how to safely get out of this mess? I can't easily get
rid of the cache tier now, since it's used by running VMs, and what is
worse, I'm not sure I won't hit the same problem when deleting the
cache pool anyway. The OSDs are shared by the cache pool and the NVMe
data pool, so it's a pretty uncomfortable situation :-(
I'm using bluestore for all OSDs, so I can't even try copying another
hit_set within filestore.
I'll be very grateful for any help.
with best regards
nikola ciprich
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx