Hello everyone,

I recently created a new Ceph 14.2.7 Nautilus cluster. The cluster consists of 3 racks with 2 OSD nodes in each rack and 12 new HDDs per node. The HDD model is TOSHIBA MG07ACA14TE 14 TB. All data pools are EC pools.

Yesterday I decided to increase the PG count on one of the pools with the command "ceph osd pool set photo.buckets.data pg_num 512". After that, many OSDs started to crash and were marked "out" and "down". I tried increasing osd_recovery_sleep to 1s, but the OSDs still crashed. The OSDs only started working properly once I set the "norecover" flag, but scrub errors appeared after that. (The command sequence is summarized at the end of this mail.)

In the OSD logs during the crashes I found this:

---
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.7/rpm/el7/BUILD/ceph-14.2.7/src/osd/ECBackend.cc: In function 'void ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)' thread 7f8af535d700 time 2020-10-21 15:12:11.460092
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.7/rpm/el7/BUILD/ceph-14.2.7/src/osd/ECBackend.cc: 648: FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset(after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x55fc694d6c0f]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 2: (()+0x4dddd7) [0x55fc694d6dd7]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 3: (ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)+0x1740) [0x55fc698cafa0]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 4: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::v14_2_0::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::v14_2_0::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, boost::optional<std::map<std::string, ceph::buffer::v14_2_0::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::v14_2_0::list> > > >, RecoveryMessages*)+0x734) [0x55fc698cb804]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 5: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x94) [0x55fc698ebbe4]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 6: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x8c) [0x55fc698bfdcc]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 7: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*, ZTracer::Trace const&)+0x109c) [0x55fc698d6b8c]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 8: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x17f) [0x55fc698d718f]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 9: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x4a) [0x55fc697c18ea]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 10: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5b3) [0x55fc697676b3]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 11: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x362) [0x55fc695b3d72]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 12: (PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x62) [0x55fc698415c2]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 13: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x90f) [0x55fc695cebbf]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 14: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b6) [0x55fc69b6f976]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 15: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55fc69b72490]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 16: (()+0x7e65) [0x7f8b1ddede65]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: 17: (clone()+0x6d) [0x7f8b1ccb188d]
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: *** Caught signal (Aborted) **
Oct 21 15:12:11 ceph-osd-201 ceph-osd[58159]: in thread 7f8af535d700 thread_name:tp_osd_tp
---

The current EC profile and pool info are below:

# ceph osd erasure-code-profile get EC42
crush-device-class=hdd
crush-failure-domain=host
crush-root=main
jerasure-per-chunk-alignment=false
k=4
m=2
plugin=jerasure
technique=reed_sol_van
w=8

pool 25 'photo.buckets.data' erasure size 6 min_size 4 crush_rule 6 object_hash rjenkins pg_num 512 pgp_num 280 pgp_num_target 512 autoscale_mode warn last_change 43418 lfor 0/0/42223 flags hashpspool stripe_width 1048576 application rgw

Current ceph status:

# ceph -s
  cluster:
    id:     9ec8d309-a620-4ad8-93fa-c2d111e5256e
    health: HEALTH_ERR
            norecover flag(s) set
            1 pools have many more objects per pg than average
            4542629 scrub errors
            Possible data damage: 6 pgs inconsistent
            Degraded data redundancy: 1207268/578535561 objects degraded (0.209%), 51 pgs degraded, 35 pgs undersized
            85 pgs not deep-scrubbed in time

  services:
    mon: 3 daemons, quorum ceph-osd-101,ceph-osd-201,ceph-osd-301 (age 2w)
    mgr: ceph-osd-101(active, since 3w), standbys: ceph-osd-301, ceph-osd-201
    osd: 72 osds: 72 up (since 11h), 72 in (since 21h); 48 remapped pgs
         flags norecover
    rgw: 6 daemons active (ceph-osd-101.rgw0, ceph-osd-102.rgw0, ceph-osd-201.rgw0, ceph-osd-202.rgw0, ceph-osd-301.rgw0, ceph-osd-302.rgw0)

  data:
    pools:   26 pools, 15680 pgs
    objects: 96.46M objects, 124 TiB
    usage:   303 TiB used, 613 TiB / 917 TiB avail
    pgs:     1207268/578535561 objects degraded (0.209%)
             14068769/578535561 objects misplaced (2.432%)
             15290 active+clean
             312   active+recovering
             30    active+undersized+degraded+remapped+backfilling
             21    active+recovering+degraded
             13    active+remapped+backfilling
             6     active+clean+inconsistent
             5     active+recovering+undersized+remapped
             3     active+clean+scrubbing+deep

So now my cluster is stuck and cannot recover properly. Can anyone shed some light on this problem? Is it a bug?
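For reference, this is roughly the sequence of commands I ran (I changed recovery_sleep at runtime via injectargs; I am reconstructing the exact syntax here, so treat it as approximate):

# ceph osd pool set photo.buckets.data pg_num 512        # this is what triggered the crashes
# ceph tell osd.* injectargs '--osd_recovery_sleep 1'    # OSDs kept crashing anyway
# ceph osd set norecover                                 # OSDs stay up now, but scrub errors appeared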
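If more detail on the scrub errors would help, I can pull the inconsistency and crash reports, e.g. with the commands below (25.1f is just a placeholder PG id, not one of my actual inconsistent PGs):

# ceph crash ls                                           # recent OSD crash reports
# ceph health detail                                      # shows which PGs are inconsistent/degraded
# rados list-inconsistent-pg photo.buckets.data           # inconsistent PGs in the affected pool
# rados list-inconsistent-obj 25.1f --format=json-pretty  # per-object scrub errors for one PG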