Hello everyone,
I recently created a new Ceph 14.2.7 Nautilus cluster. The cluster consists
of 3 racks with 2 OSD nodes in each rack and 12 new HDDs in each node
(72 OSDs in total). The HDD model is TOSHIBA MG07ACA14TE, 14 TB.
All data pools are EC pools.
Yesterday I decided to increase the PG count on one of the pools with the
command "ceph osd pool set photo.buckets.data pg_num 512". After that, many
OSDs started crashing and were being marked "down" and "out". I tried
increasing osd_recovery_sleep to 1 s, but the OSDs kept crashing. The OSDs
only stayed up once I set the "norecover" flag, but scrub errors appeared
after that.
In the OSD logs during the crashes I found this:
---
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.7/rpm/el7/BUILD/ceph-14.2.7/src/osd/ECBackend.cc:
In function 'void ECBackend::continue_recovery_op(ECBackend::RecoveryOp&,
RecoveryMessages*)'
thread 7f8af535d700 time 2020-10-21 15:12:11.460092
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.7/rpm/el7/BUILD/ceph-14.2.7/src/osd/ECBackend.cc:
648: FAILED ceph_assert(pop.data.length() ==
sinfo.aligned_logical_offset_to_chunk_offset(
after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: ceph version 14.2.7
(3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 1:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x14a) [0x55fc694d6c0f]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 2: (()+0x4dddd7)
[0x55fc694d6dd7]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 3:
(ECBackend::continue_recovery_op(ECBackend::RecoveryOp&,
RecoveryMessages*)+0x1740) [0x55fc698cafa0]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 4:
(ECBackend::handle_recovery_read_complete(hobject_t const&,
boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t,
ceph::buffer::v14_2_0::list, std::less<pg_shard_t>,
std::allocator<std::pair<pg_shard_t const, ceph::buffer::v14_2_0::list> >
, boost::tuples::null_type, boost::tuples::null_type,
boost::tuples::null_type, boost::tuples::null_type,
boost::tuples::null_type, boost::tuples::null_type,
boost::tuples::null_type>&, boost::optional<std::map<std::string,
ceph::buffer::v14_2_0::list, std::less<std::string>,
std::allocator<std::pair<std::string const, ceph::buffer::v14_2_0::list> >
>, RecoveryMessages*)+0x734) [0x55fc698cb804]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 5:
(OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*,
ECBackend::read_result_t&>&)+0x94) [0x55fc698ebbe4]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 6:
(ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x8c)
[0x55fc698bfdcc]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 7:
(ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&,
RecoveryMessages*, ZTracer::Trace const&)+0x109c) [0x55fc698d6b8c]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 8:
(ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x17f)
[0x55fc698d718f]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 9:
(PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x4a)
[0x55fc697c18ea]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 10:
(PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x5b3) [0x55fc697676b3]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 11:
(OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x362) [0x55fc695b3d72]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 12: (PGOpItem::run(OSD*,
OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x62)
[0x55fc698415c2]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 13:
(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x90f)
[0x55fc695cebbf]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 14:
(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b6)
[0x55fc69b6f976]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 15:
(ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55fc69b72490]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 16: (()+0x7e65)
[0x7f8b1ddede65]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 17: (clone()+0x6d)
[0x7f8b1ccb188d]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: *** Caught signal (Aborted) **
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: in thread 7f8af535d700
thread_name:tp_osd_tp
---
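The same backtrace is also recorded by the mgr crash module, so if more
detail is needed for a tracker ticket I can pull the full reports with
something like the following (the crash id is a placeholder):
---
# list recent daemon crashes recorded by the crash module
ceph crash ls

# dump the full metadata and backtrace for one crash
ceph crash info <crash-id>
---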
Current EC profile and pool info below:
# ceph osd erasure-code-profile get EC42
crush-device-class=hdd
crush-failure-domain=host
crush-root=main
jerasure-per-chunk-alignment=false
k=4
m=2
plugin=jerasure
technique=reed_sol_van
w=8
pool 25 'photo.buckets.data' erasure size 6 min_size 4 crush_rule 6
object_hash rjenkins pg_num 512 pgp_num 280 pgp_num_target 512
autoscale_mode warn last_change 43418 lfor 0/0/42223 flags hashpspool
stripe_width 1048576 application rgw
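Note that the pool shows pg_num 512 but pgp_num 280 (pgp_num_target 512),
so the placement change triggered by the pg_num increase is still only
partially applied. I am watching its progress with roughly the following
(grep pattern is just for readability):
---
# pgp_num should gradually converge towards pgp_num_target (512)
ceph osd pool get photo.buckets.data pg_num
ceph osd pool get photo.buckets.data pgp_num

# full pool line, same output as quoted above
ceph osd pool ls detail | grep photo.buckets.data
---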
Current ceph status:
# ceph -s
cluster:
id: 9ec8d309-a620-4ad8-93fa-c2d111e5256e
health: HEALTH_ERR
norecover flag(s) set
1 pools have many more objects per pg than average
4542629 scrub errors
Possible data damage: 6 pgs inconsistent
Degraded data redundancy: 1207268/578535561 objects degraded
(0.209%), 51 pgs degraded, 35 pgs undersized
85 pgs not deep-scrubbed in time
services:
mon: 3 daemons, quorum ceph-osd-101,ceph-osd-201,ceph-osd-301 (age 2w)
mgr: ceph-osd-101(active, since 3w), standbys: ceph-osd-301,
ceph-osd-201
osd: 72 osds: 72 up (since 11h), 72 in (since 21h); 48 remapped pgs
flags norecover
rgw: 6 daemons active (ceph-osd-101.rgw0, ceph-osd-102.rgw0,
ceph-osd-201.rgw0, ceph-osd-202.rgw0, ceph-osd-301.rgw0, ceph-osd-302.rgw0)
data:
pools: 26 pools, 15680 pgs
objects: 96.46M objects, 124 TiB
usage: 303 TiB used, 613 TiB / 917 TiB avail
pgs: 1207268/578535561 objects degraded (0.209%)
14068769/578535561 objects misplaced (2.432%)
15290 active+clean
312 active+recovering
30 active+undersized+degraded+remapped+backfilling
21 active+recovering+degraded
13 active+remapped+backfilling
6 active+clean+inconsistent
5 active+recovering+undersized+remapped
3 active+clean+scrubbing+deep
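For the scrub errors and the 6 inconsistent PGs, the usual inspection
commands would be something like the following (the PG id is a
placeholder; I have not attempted any repair so far):
---
# show which PGs are reported as inconsistent
ceph health detail

# list inconsistent PGs in the affected pool and the objects inside one PG
rados list-inconsistent-pg photo.buckets.data
rados list-inconsistent-obj <pgid> --format=json-pretty

# per-PG repair could be attempted later with:
# ceph pg repair <pgid>
---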
So now my cluster is stuck and can't recover properly. Can someone give me
more information about this problem? Is it a bug?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx