Hi,

We have an old cluster with 3 nodes running Ceph version 15.2.17 (Octopus). One PG, in an RBD pool used by Kubernetes, is in state active+clean+inconsistent, and we are unable to repair it. The earliest indication of the issue comes from ceph-osd.4.log on one of the nodes.
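For completeness: we trigger the repair with the standard command, so I do not think we are doing anything exotic on our side:

    ceph pg repair 20.4

The log from the first repair attempt on 2024-05-23: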
2024-05-23T14:34:38.003+0200 7f00b907c700  0 log_channel(cluster) log [DBG] : 20.4 repair starts
2024-05-23T14:34:46.095+0200 7f00b907c700 -1 log_channel(cluster) log [ERR] : 20.4 shard 4 20:20a41d01:stc::rbd_data.44feb146358046.0000000000007737:38e : missing
2024-05-23T14:34:46.095+0200 7f00b907c700 -1 log_channel(cluster) log [ERR] : 20.4 shard 16 20:20a41d01:stc::rbd_data.44feb146358046.0000000000007737:38e : missing
2024-05-23T14:34:46.095+0200 7f00b907c700 -1 log_channel(cluster) log [ERR] : repair 20.4 20:20a41d01:stc::rbd_data.44feb146358046.0000000000007737:38e : is an unexpected clone
2024-05-23T14:34:46.095+0200 7f00b907c700 -1 osd.4 pg_epoch: 30768 pg[20.4( v 30768'359661769 (30767'359657067,30768'359661769] local-lis/les=30660/30665 n=10628 ec=28562/20971 lis/c=30660/30660 les/c/f=30665/30665/0 sis=30660) [4,10,16] r=0 lpr=30660 luod=30768'359661760 crt=30768'359661769 lcod 30768'359661759 mlcod 30768'359661759 active+clean+scrubbing+deep+inconsistent+repair REQ_SCRUB] _scan_snaps no head for 20:20a41d01:stc::rbd_data.44feb146358046.0000000000007737:38e (have 20:20a42186:stc::rbd_data.44feb146358046.00000000000425b6:head)
2024-05-23T14:35:27.712+0200 7f00b5074700 -1 log_channel(cluster) log [ERR] : 20.4 repair 1 missing, 0 inconsistent objects
2024-05-23T14:35:27.712+0200 7f00b5074700 -1 log_channel(cluster) log [ERR] : 20.4 repair 3 errors, 2 fixed
2024-05-23T14:35:27.720+0200 7f00b5074700 -1 ./src/osd/PrimaryLogPG.cc: In function 'int PrimaryLogPG::recover_missing(const hobject_t&, eversion_t, int, PGBackend::RecoveryHandle*)' thread 7f00b5074700 time 2024-05-23T14:35:27.718485+0200
./src/osd/PrimaryLogPG.cc: 11540: FAILED ceph_assert(head_obc)

 ceph version 15.2.17 (542df8d06ef24dbddcf4994db16bcc4c89c9ec2d) octopus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x131) [0x55a740013abb]
 2: (()+0x9d5c46) [0x55a740013c46]
 3: (PrimaryLogPG::recover_missing(hobject_t const&, eversion_t, int, PGBackend::RecoveryHandle*)+0x6cf) [0x55a7401fabbf]
 4: (PrimaryLogPG::recover_primary(unsigned long, ThreadPool::TPHandle&)+0xf7e) [0x55a74023b16e]
 5: (PrimaryLogPG::start_recovery_ops(unsigned long, ThreadPool::TPHandle&, unsigned long*)+0x5ee) [0x55a740246b4e]
 6: (OSD::do_recovery(PG*, unsigned int, unsigned long, ThreadPool::TPHandle&)+0x295) [0x55a7400c4c85]
 7: (ceph::osd::scheduler::PGRecovery::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x19) [0x55a74031c229]
 8: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xf13) [0x55a7400e13c3]
 9: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x41a) [0x55a74072ae6a]
 10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55a74072d410]
 11: (()+0x7ea7) [0x7f00d59e2ea7]
 12: (clone()+0x3f) [0x7f00d556ba6f]
2024-05-23T14:35:27.724+0200 7f00b5074700 -1 *** Caught signal (Aborted) ** in thread 7f00b5074700 thread_name:tp_osd_tp
 ceph version 15.2.17 (542df8d06ef24dbddcf4994db16bcc4c89c9ec2d) octopus (stable)
 1: (()+0x13140) [0x7f00d59ee140]
 2: (gsignal()+0x141) [0x7f00d54a8ce1]
 3: (abort()+0x123) [0x7f00d5492537]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x17b) [0x55a740013b05]
 5: (()+0x9d5c46) [0x55a740013c46]
 6: (PrimaryLogPG::recover_missing(hobject_t const&, eversion_t, int, PGBackend::RecoveryHandle*)+0x6cf) [0x55a7401fabbf]
 7: (PrimaryLogPG::recover_primary(unsigned long, ThreadPool::TPHandle&)+0xf7e) [0x55a74023b16e]
 8: (PrimaryLogPG::start_recovery_ops(unsigned long, ThreadPool::TPHandle&, unsigned long*)+0x5ee) [0x55a740246b4e]
 9: (OSD::do_recovery(PG*, unsigned int, unsigned long, ThreadPool::TPHandle&)+0x295) [0x55a7400c4c85]
 10: (ceph::osd::scheduler::PGRecovery::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x19) [0x55a74031c229]
 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xf13) [0x55a7400e13c3]
 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x41a) [0x55a74072ae6a]
 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55a74072d410]
 14: (()+0x7ea7) [0x7f00d59e2ea7]
 15: (clone()+0x3f) [0x7f00d556ba6f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
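Should the full scrub error records be useful, my understanding is that they can be dumped with the command below; we can post that output as well:

    rados list-inconsistent-obj 20.4 --format=json-pretty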
Every time we try to start a repair on 20.4, all three OSDs in the acting set (4, 10 and 16) crash. On a later attempt the assertion fires in PG::update_snap_map instead:

2024-05-28T11:14:30.573+0200 7f4d3405a700  0 log_channel(cluster) log [DBG] : 20.4 repair starts
2024-05-28T11:14:39.481+0200 7f4d30052700 -1 log_channel(cluster) log [ERR] : 20.4 shard 10 soid 20:20a41d01:stc::rbd_data.44feb146358046.0000000000007737:38e : candidate had a read error
2024-05-28T11:14:39.481+0200 7f4d30052700 -1 log_channel(cluster) log [ERR] : 20.4 soid 20:20a41d01:stc::rbd_data.44feb146358046.0000000000007737:38e : repairing object info data_digest
2024-05-28T11:14:39.481+0200 7f4d30052700 -1 log_channel(cluster) log [ERR] : 20.4 soid 20:20a41d01:stc::rbd_data.44feb146358046.0000000000007737:38e : data_digest 0x43d61c5d != data_digest 0x867fac72 from auth oi 20:20a41d01:stc::rbd_data.44feb146358046.0000000000007737:38e(29034'359526115 osd.4.0:147412411 dirty|data_digest|omap_digest s 4194304 uv 287854203 dd 867fac72 od ffffffff alloc_hint [0 0 0])
2024-05-28T11:14:39.481+0200 7f4d30052700 -1 log_channel(cluster) log [ERR] : 20.4 shard 16 20:20a41d01:stc::rbd_data.44feb146358046.0000000000007737:38e : missing
2024-05-28T11:14:39.481+0200 7f4d30052700 -1 log_channel(cluster) log [ERR] : repair 20.4 20:20a41d01:stc::rbd_data.44feb146358046.0000000000007737:38e : is an unexpected clone
2024-05-28T11:14:39.489+0200 7f4d30052700 -1 ./src/osd/PG.cc: In function 'void PG::update_snap_map(const std::vector<pg_log_entry_t>&, ObjectStore::Transaction&)' thread 7f4d30052700 time 2024-05-28T11:14:39.487768+0200
./src/osd/PG.cc: 1225: FAILED ceph_assert(r == 0)

 ceph version 15.2.17 (542df8d06ef24dbddcf4994db16bcc4c89c9ec2d) octopus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x131) [0x55c8140ebabb]
 2: (()+0x9d5c46) [0x55c8140ebc46]
 3: (PG::update_snap_map(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, ceph::os::Transaction&)+0x6f8) [0x55c814241d28]
 4: (PrimaryLogPG::log_operation(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, std::optional<pg_hit_set_history_t> const&, eversion_t const&, eversion_t const&, eversion_t const&, bool, ceph::os::Transaction&, bool)+0x230) [0x55c814358510]
 5: (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, std::optional<pg_hit_set_history_t>&, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0x6cd) [0x55c8144df8ad]
 6: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0x53d) [0x55c8142b250d]
 7: (PrimaryLogPG::simple_opc_submit(std::unique_ptr<PrimaryLogPG::OpContext, std::default_delete<PrimaryLogPG::OpContext> >)+0x9b) [0x55c8142b411b]
 8: (PrimaryLogPG::scrub_snapshot_metadata(ScrubMap&, std::map<hobject_t, std::pair<std::optional<unsigned int>, std::optional<unsigned int> >, std::less<hobject_t>, std::allocator<std::pair<hobject_t const, std::pair<std::optional<unsigned int>, std::optional<unsigned int> > > > > const&)+0x24b1) [0x55c8142dc0e1]
 9: (PG::scrub_compare_maps()+0x18db) [0x55c81424beeb]
 10: (PG::chunky_scrub(ThreadPool::TPHandle&)+0x14a) [0x55c81424eb1a]
 11: (ceph::osd::scheduler::PGScrub::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x1a) [0x55c8143f41fa]
 12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xf13) [0x55c8141b93c3]
 13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x41a) [0x55c814802e6a]
 14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55c814805410]
 15: (()+0x7ea7) [0x7f4d509c0ea7]
 16: (clone()+0x3f) [0x7f4d50549a6f]
2024-05-28T11:14:39.497+0200 7f4d30052700 -1 *** Caught signal (Aborted) ** in thread 7f4d30052700 thread_name:tp_osd_tp
 ceph version 15.2.17 (542df8d06ef24dbddcf4994db16bcc4c89c9ec2d) octopus (stable)
 1: (()+0x13140) [0x7f4d509cc140]
 2: (gsignal()+0x141) [0x7f4d50486ce1]
 3: (abort()+0x123) [0x7f4d50470537]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x17b) [0x55c8140ebb05]
 5: (()+0x9d5c46) [0x55c8140ebc46]
 6: (PG::update_snap_map(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, ceph::os::Transaction&)+0x6f8) [0x55c814241d28]
 7: (PrimaryLogPG::log_operation(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, std::optional<pg_hit_set_history_t> const&, eversion_t const&, eversion_t const&, eversion_t const&, bool, ceph::os::Transaction&, bool)+0x230) [0x55c814358510]
 8: (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, std::optional<pg_hit_set_history_t>&, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0x6cd) [0x55c8144df8ad]
 9: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0x53d) [0x55c8142b250d]
 10: (PrimaryLogPG::simple_opc_submit(std::unique_ptr<PrimaryLogPG::OpContext, std::default_delete<PrimaryLogPG::OpContext> >)+0x9b) [0x55c8142b411b]
 11: (PrimaryLogPG::scrub_snapshot_metadata(ScrubMap&, std::map<hobject_t, std::pair<std::optional<unsigned int>, std::optional<unsigned int> >, std::less<hobject_t>, std::allocator<std::pair<hobject_t const, std::pair<std::optional<unsigned int>, std::optional<unsigned int> > > > > const&)+0x24b1) [0x55c8142dc0e1]
 12: (PG::scrub_compare_maps()+0x18db) [0x55c81424beeb]
 13: (PG::chunky_scrub(ThreadPool::TPHandle&)+0x14a) [0x55c81424eb1a]
 14: (ceph::osd::scheduler::PGScrub::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x1a) [0x55c8143f41fa]
 15: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xf13) [0x55c8141b93c3]
 16: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x41a) [0x55c814802e6a]
 17: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55c814805410]
 18: (()+0x7ea7) [0x7f4d509c0ea7]
 19: (clone()+0x3f) [0x7f4d50549a6f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
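From digging through the list archives, the common suggestion for an "is an unexpected clone" error seems to be to remove the stale clone metadata with ceph-objectstore-tool while the OSD is stopped. The sketch below is only our understanding, not something we have run: the snap id would be 0x38e from the log (910 in decimal, if we read it correctly), the data path is just the usual default, and the object JSON placeholder would come from the list step:

    ceph osd set noout
    systemctl stop ceph-osd@4
    # find the JSON object spec for the :38e clone of this object
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 \
        --op list rbd_data.44feb146358046.0000000000007737
    # remove the stale clone metadata for snap 0x38e (910 decimal)
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 --pgid 20.4 \
        '<object-json-from-list>' remove-clone-metadata 910
    systemctl start ceph-osd@4
    ceph osd unset noout

and then the same, one OSD at a time, on osd.10 and osd.16. Given that the OSDs also crash during recovery, we are hesitant to try this without confirmation that it is the right approach.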
We are unsure how to rectify the issue and get the cluster healthy again. Any guidance would be appreciated.

Best regards,
Thomas Björklund
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx