Hi Pavin,

Here are some additional developments. There is one PG that is stuck and unable to recover. I've included the relevant ceph -s, ceph health detail and pg query outputs below.

- As suggested, there were some leftover lock files in /var/run/ceph/ pertaining to rgw. I removed the service, deleted the stale lock files and redeployed the RGWs. All of them started with the same log entries:

  7ff5d9aaf5c0  0 deferred set uid:gid to 167:167 (ceph:ceph)
  7ff5d9aaf5c0  0 ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable), process radosgw, pid 2
  7ff5d9aaf5c0  0 framework: beast
  7ff5d9aaf5c0  0 framework conf key: port, val: 80
  7ff5d9aaf5c0  1 radosgw_Main not setting numa affinity
  7ff5d9aaf5c0  1 rgw_d3n: rgw_d3n_l1_local_datacache_enabled=0
  7ff5d9aaf5c0  1 D3N datacache enabled: 0

  No further log entries have been recorded since the RGWs were started after the redeployment.

The cluster has settled and there is no recovery activity. One PG is stuck, and I have a hunch that it is affecting the MDS and RGW processes, as stated in the thread.

The PG is stuck as active+remapped+backfilling:

  data:
    volumes: 2/2 healthy
    pools:   16 pools, 1504 pgs
    objects: 24.49M objects, 79 TiB
    usage:   119 TiB used, 390 TiB / 508 TiB avail
    pgs:     65210/146755179 objects misplaced (0.044%)
             1503 active+clean
             1    active+remapped+backfilling

  progress:
    Global Recovery Event (6h)
      [===========================.]
      (remaining: 73s)

# ceph health detail
HEALTH_WARN 1 MDSs report slow metadata IOs; 1 pgs not deep-scrubbed in time; 1 pgs not scrubbed in time
[WRN] MDS_SLOW_METADATA_IO: 1 MDSs report slow metadata IOs
    mds.fs01.ceph02mon02.wicrdz(mds.0): 5 slow metadata IOs are blocked > 30 secs, oldest blocked for 74436 secs
[WRN] PG_NOT_DEEP_SCRUBBED: 1 pgs not deep-scrubbed in time
    pg 14.ff not deep-scrubbed since 2022-12-14T19:35:51.893008+0000
[WRN] PG_NOT_SCRUBBED: 1 pgs not scrubbed in time
    pg 14.ff not scrubbed since 2022-12-17T06:33:40.577932+0000

From the following pg query:

- "pgid": "14.ffs0" is stuck as peering (osd 5)
- "pgid": "14.ffs4" is stuck as unknown (osd 18)
- "pgid": "14.ffs5" is stuck as unknown (osd 24)
- "pgid": "14.ffs3" is stuck as unknown (osd 42)
- "pgid": "14.ffs2" is stuck as unknown (osd 58)
- "pgid": "14.ffs1" is marked as active+clean (osd 36)

# ceph pg 14.ff query
{ "snap_trimq": "[]", "snap_trimq_len": 0, "state": "active+remapped+backfilling", "epoch": 19594, "up": [ 5, 36, 58, 42, 18, 24 ], "acting": [ 50, 36, 5, 26, 15, 46 ], "backfill_targets": [ "5(0)", "18(4)", "24(5)", "42(3)", "58(2)" ], "acting_recovery_backfill": [ "5(0)", "5(2)", "15(4)", "18(4)", "24(5)", "26(3)", "36(1)", "42(3)", "46(5)", "50(0)", "58(2)" ], "info": { "pgid": "14.ffs0", "last_update": "19550'35077", "last_complete": "19550'35077", "log_tail": "13761'32157", "last_user_version": 35077, "last_backfill": "MAX", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, "last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": 
"2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "19550'35077", "reported_seq": 396919, "reported_epoch": 19594, "state": "active+remapped+backfilling", "last_fresh": "2022-12-28T22:03:20.278478+0000", "last_change": "2022-12-26T21:27:51.600940+0000", "last_active": "2022-12-28T22:03:20.278478+0000", "last_peered": "2022-12-28T22:03:20.278478+0000", "last_clean": "2022-12-26T21:27:45.471954+0000", "last_became_active": "2022-12-26T21:27:51.085966+0000", "last_became_peered": "2022-12-26T21:27:51.085966+0000", "last_unstale": "2022-12-28T22:03:20.278478+0000", "last_undegraded": "2022-12-28T22:03:20.278478+0000", "last_fullsized": "2022-12-28T22:03:20.278478+0000", "mapping_epoch": 16615, "log_start": "13761'32157", "ondisk_log_start": "13761'32157", "created": 4537, "last_epoch_clean": 14655, "parent": "0.0", "parent_split_bits": 8, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "objects_scrubbed": 16227, "log_size": 2920, "ondisk_log_size": 2920, "stats_invalid": true, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 14, "scrub_schedule": "queued for deep scrub", "scrub_duration": 13.320415128, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 56709530650, "num_objects": 13548, "num_object_clones": 0, "num_object_copies": 81288, "num_objects_missing_on_primary": 0, "num_objects_missing": 0, "num_objects_degraded": 0, "num_objects_misplaced": 65210, "num_objects_unfound": 0, "num_objects_dirty": 13548, "num_whiteouts": 0, "num_read": 67760, "num_read_kb": 177798674, "num_write": 21231, "num_write_kb": 70024901, "num_scrub_errors": 0, 
"num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 34481, "num_bytes_recovered": 144364295675, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [ 5, 36, 58, 42, 18, 24 ], "acting": [ 50, 36, 5, 26, 15, 46 ], "avail_no_missing": [ "50(0)", "5(2)", "15(4)", "26(3)", "36(1)", "46(5)" ], "object_location_counts": [ { "shards": "5(2),15(4),26(3),36(1),46(5),50(0)", "objects": 13548 } ], "blocked_by": [], "up_primary": 5, "acting_primary": 50, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", "history": [] } }, "peer_info": [ { "peer": "5(0)", "pgid": "14.ffs0", "last_update": "19550'35077", "last_complete": "17223'34381", "log_tail": "12957'31152", "last_user_version": 0, "last_backfill": "14:ff09a915:::10000001249.00000353:head", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, "last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "0'0", "reported_seq": 2, "reported_epoch": 16614, "state": 
"peering", "last_fresh": "2022-12-26T21:27:47.993646+0000", "last_change": "2022-12-26T21:27:47.769141+0000", "last_active": "0.000000", "last_peered": "0.000000", "last_clean": "0.000000", "last_became_active": "0.000000", "last_became_peered": "0.000000", "last_unstale": "2022-12-26T21:27:47.993646+0000", "last_undegraded": "2022-12-26T21:27:47.993646+0000", "last_fullsized": "2022-12-26T21:27:47.993646+0000", "mapping_epoch": 16615, "log_start": "0'0", "ondisk_log_start": "0'0", "created": 4537, "last_epoch_clean": 14655, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "objects_scrubbed": 0, "log_size": 0, "ondisk_log_size": 0, "stats_invalid": false, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 0, "scrub_schedule": "queued for deep scrub", "scrub_duration": 0, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 2115497390, "num_objects": 506, "num_object_clones": 0, "num_object_copies": 0, "num_objects_missing_on_primary": 0, "num_objects_missing": 13042, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 506, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 42, "num_write_kb": 172032, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 
0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [ 5, 36, 58, 42, 18, 24 ], "acting": [ 50, 36, 5, 26, 15, 46 ], "avail_no_missing": [], "object_location_counts": [], "blocked_by": [ 24 ], "up_primary": 5, "acting_primary": 50, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 1, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", "history": [] } }, { "peer": "5(2)", "pgid": "14.ffs2", "last_update": "19550'35077", "last_complete": "17223'34381", "log_tail": "12957'31152", "last_user_version": 33714, "last_backfill": "MAX", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, "last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "16465'33714", "reported_seq": 388679, "reported_epoch": 16465, "state": "active+clean", "last_fresh": "2022-12-26T16:10:12.977587+0000", "last_change": "2022-12-26T02:28:03.455705+0000", "last_active": "2022-12-26T16:10:12.977587+0000", "last_peered": "2022-12-26T16:10:12.977587+0000", "last_clean": "2022-12-26T16:10:12.977587+0000", "last_became_active": "2022-12-26T02:28:03.455192+0000", "last_became_peered": "2022-12-26T02:28:03.455192+0000", "last_unstale": "2022-12-26T16:10:12.977587+0000", "last_undegraded": "2022-12-26T16:10:12.977587+0000", "last_fullsized": "2022-12-26T16:10:12.977587+0000", 
"mapping_epoch": 16615, "log_start": "12957'31152", "ondisk_log_start": "12957'31152", "created": 4537, "last_epoch_clean": 14655, "parent": "0.0", "parent_split_bits": 8, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "objects_scrubbed": 16227, "log_size": 2562, "ondisk_log_size": 2562, "stats_invalid": true, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 14, "scrub_schedule": "queued for deep scrub", "scrub_duration": 13.320415128, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 51491812456, "num_objects": 12301, "num_object_clones": 0, "num_object_copies": 73806, "num_objects_missing_on_primary": 0, "num_objects_missing": 0, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 12301, "num_whiteouts": 0, "num_read": 66867, "num_read_kb": 177079789, "num_write": 19946, "num_write_kb": 64929471, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 34017, "num_bytes_recovered": 142424959053, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [ 5, 36, 58, 42, 18, 24 ], "acting": [ 50, 36, 5, 26, 15, 46 ], "avail_no_missing": [], 
"object_location_counts": [], "blocked_by": [], "up_primary": 5, "acting_primary": 50, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", "history": [] } }, { "peer": "15(4)", "pgid": "14.ffs4", "last_update": "19550'35077", "last_complete": "17223'34381", "log_tail": "12957'31152", "last_user_version": 33714, "last_backfill": "MAX", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, "last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "16465'33714", "reported_seq": 388679, "reported_epoch": 16465, "state": "active+clean", "last_fresh": "2022-12-26T16:10:12.977587+0000", "last_change": "2022-12-26T02:28:03.455705+0000", "last_active": "2022-12-26T16:10:12.977587+0000", "last_peered": "2022-12-26T16:10:12.977587+0000", "last_clean": "2022-12-26T16:10:12.977587+0000", "last_became_active": "2022-12-26T02:28:03.455192+0000", "last_became_peered": "2022-12-26T02:28:03.455192+0000", "last_unstale": "2022-12-26T16:10:12.977587+0000", "last_undegraded": "2022-12-26T16:10:12.977587+0000", "last_fullsized": "2022-12-26T16:10:12.977587+0000", "mapping_epoch": 16615, "log_start": "12957'31152", "ondisk_log_start": "12957'31152", "created": 4537, "last_epoch_clean": 14655, "parent": "0.0", "parent_split_bits": 8, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": 
"2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "objects_scrubbed": 16227, "log_size": 2562, "ondisk_log_size": 2562, "stats_invalid": true, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 14, "scrub_schedule": "queued for deep scrub", "scrub_duration": 13.320415128, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 51491812456, "num_objects": 12301, "num_object_clones": 0, "num_object_copies": 73806, "num_objects_missing_on_primary": 0, "num_objects_missing": 0, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 12301, "num_whiteouts": 0, "num_read": 66867, "num_read_kb": 177079789, "num_write": 19946, "num_write_kb": 64929471, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 34017, "num_bytes_recovered": 142424959053, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [ 5, 36, 58, 42, 18, 24 ], "acting": [ 50, 36, 5, 26, 15, 46 ], "avail_no_missing": [], "object_location_counts": [], "blocked_by": [], "up_primary": 5, "acting_primary": 50, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", "history": [] } }, { "peer": "18(4)", "pgid": "14.ffs4", "last_update": "19550'35077", 
"last_complete": "17223'34381", "log_tail": "12957'31152", "last_user_version": 0, "last_backfill": "14:ff09a915:::10000001249.00000353:head", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, "last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "0'0", "reported_seq": 0, "reported_epoch": 0, "state": "unknown", "last_fresh": "0.000000", "last_change": "0.000000", "last_active": "0.000000", "last_peered": "0.000000", "last_clean": "0.000000", "last_became_active": "0.000000", "last_became_peered": "0.000000", "last_unstale": "0.000000", "last_undegraded": "0.000000", "last_fullsized": "0.000000", "mapping_epoch": 0, "log_start": "0'0", "ondisk_log_start": "0'0", "created": 0, "last_epoch_clean": 0, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "0.000000", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "0.000000", "last_clean_scrub_stamp": "0.000000", "objects_scrubbed": 0, "log_size": 0, "ondisk_log_size": 0, "stats_invalid": false, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 0, "scrub_schedule": "--", "scrub_duration": 0, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 2115497390, "num_objects": 506, "num_object_clones": 0, "num_object_copies": 0, "num_objects_missing_on_primary": 0, 
"num_objects_missing": 13042, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 506, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 42, "num_write_kb": 172032, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [], "acting": [], "avail_no_missing": [], "object_location_counts": [], "blocked_by": [], "up_primary": -1, "acting_primary": -1, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 1, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", "history": [] } }, { "peer": "24(5)", "pgid": "14.ffs5", "last_update": "19550'35077", "last_complete": "17223'34381", "log_tail": "12957'31152", "last_user_version": 0, "last_backfill": "14:ff09a915:::10000001249.00000353:head", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, "last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "0'0", 
"reported_seq": 0, "reported_epoch": 0, "state": "unknown", "last_fresh": "0.000000", "last_change": "0.000000", "last_active": "0.000000", "last_peered": "0.000000", "last_clean": "0.000000", "last_became_active": "0.000000", "last_became_peered": "0.000000", "last_unstale": "0.000000", "last_undegraded": "0.000000", "last_fullsized": "0.000000", "mapping_epoch": 0, "log_start": "0'0", "ondisk_log_start": "0'0", "created": 0, "last_epoch_clean": 0, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "0.000000", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "0.000000", "last_clean_scrub_stamp": "0.000000", "objects_scrubbed": 0, "log_size": 0, "ondisk_log_size": 0, "stats_invalid": false, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 0, "scrub_schedule": "--", "scrub_duration": 0, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 2115497390, "num_objects": 506, "num_object_clones": 0, "num_object_copies": 0, "num_objects_missing_on_primary": 0, "num_objects_missing": 13042, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 506, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 42, "num_write_kb": 172032, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 
0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [], "acting": [], "avail_no_missing": [], "object_location_counts": [], "blocked_by": [], "up_primary": -1, "acting_primary": -1, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 1, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", "history": [] } }, { "peer": "26(3)", "pgid": "14.ffs3", "last_update": "19550'35077", "last_complete": "17223'34381", "log_tail": "12957'31152", "last_user_version": 33714, "last_backfill": "MAX", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, "last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "16465'33714", "reported_seq": 388679, "reported_epoch": 16465, "state": "active+clean", "last_fresh": "2022-12-26T16:10:12.977587+0000", "last_change": "2022-12-26T02:28:03.455705+0000", "last_active": "2022-12-26T16:10:12.977587+0000", "last_peered": "2022-12-26T16:10:12.977587+0000", "last_clean": "2022-12-26T16:10:12.977587+0000", "last_became_active": "2022-12-26T02:28:03.455192+0000", "last_became_peered": "2022-12-26T02:28:03.455192+0000", "last_unstale": "2022-12-26T16:10:12.977587+0000", "last_undegraded": "2022-12-26T16:10:12.977587+0000", "last_fullsized": "2022-12-26T16:10:12.977587+0000", "mapping_epoch": 16615, "log_start": "12957'31152", "ondisk_log_start": "12957'31152", "created": 4537, "last_epoch_clean": 14655, "parent": "0.0", "parent_split_bits": 8, "last_scrub": "3817'25569", "last_scrub_stamp": 
"2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "objects_scrubbed": 16227, "log_size": 2562, "ondisk_log_size": 2562, "stats_invalid": true, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 14, "scrub_schedule": "queued for deep scrub", "scrub_duration": 13.320415128, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 51491812456, "num_objects": 12301, "num_object_clones": 0, "num_object_copies": 73806, "num_objects_missing_on_primary": 0, "num_objects_missing": 0, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 12301, "num_whiteouts": 0, "num_read": 66867, "num_read_kb": 177079789, "num_write": 19946, "num_write_kb": 64929471, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 34017, "num_bytes_recovered": 142424959053, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [ 5, 36, 58, 42, 18, 24 ], "acting": [ 50, 36, 5, 26, 15, 46 ], "avail_no_missing": [], "object_location_counts": [], "blocked_by": [], "up_primary": 5, "acting_primary": 50, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", 
"history": [] } }, { "peer": "36(1)", "pgid": "14.ffs1", "last_update": "19550'35077", "last_complete": "17223'34381", "log_tail": "12957'31152", "last_user_version": 33714, "last_backfill": "MAX", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, "last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "16465'33714", "reported_seq": 388679, "reported_epoch": 16465, "state": "active+clean", "last_fresh": "2022-12-26T16:10:12.977587+0000", "last_change": "2022-12-26T02:28:03.455705+0000", "last_active": "2022-12-26T16:10:12.977587+0000", "last_peered": "2022-12-26T16:10:12.977587+0000", "last_clean": "2022-12-26T16:10:12.977587+0000", "last_became_active": "2022-12-26T02:28:03.455192+0000", "last_became_peered": "2022-12-26T02:28:03.455192+0000", "last_unstale": "2022-12-26T16:10:12.977587+0000", "last_undegraded": "2022-12-26T16:10:12.977587+0000", "last_fullsized": "2022-12-26T16:10:12.977587+0000", "mapping_epoch": 16615, "log_start": "12957'31152", "ondisk_log_start": "12957'31152", "created": 4537, "last_epoch_clean": 14655, "parent": "0.0", "parent_split_bits": 8, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "objects_scrubbed": 16227, "log_size": 2562, "ondisk_log_size": 2562, "stats_invalid": true, "dirty_stats_invalid": false, "omap_stats_invalid": false, 
"hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 14, "scrub_schedule": "queued for deep scrub", "scrub_duration": 13.320415128, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 51491812456, "num_objects": 12301, "num_object_clones": 0, "num_object_copies": 73806, "num_objects_missing_on_primary": 0, "num_objects_missing": 0, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 12301, "num_whiteouts": 0, "num_read": 66867, "num_read_kb": 177079789, "num_write": 19946, "num_write_kb": 64929471, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 34017, "num_bytes_recovered": 142424959053, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [ 5, 36, 58, 42, 18, 24 ], "acting": [ 50, 36, 5, 26, 15, 46 ], "avail_no_missing": [], "object_location_counts": [], "blocked_by": [], "up_primary": 5, "acting_primary": 50, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", "history": [] } }, { "peer": "42(3)", "pgid": "14.ffs3", "last_update": "19550'35077", "last_complete": "17223'34381", "log_tail": "12957'31152", "last_user_version": 0, "last_backfill": "14:ff09a915:::10000001249.00000353:head", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, 
"last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "0'0", "reported_seq": 0, "reported_epoch": 0, "state": "unknown", "last_fresh": "0.000000", "last_change": "0.000000", "last_active": "0.000000", "last_peered": "0.000000", "last_clean": "0.000000", "last_became_active": "0.000000", "last_became_peered": "0.000000", "last_unstale": "0.000000", "last_undegraded": "0.000000", "last_fullsized": "0.000000", "mapping_epoch": 0, "log_start": "0'0", "ondisk_log_start": "0'0", "created": 0, "last_epoch_clean": 0, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "0.000000", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "0.000000", "last_clean_scrub_stamp": "0.000000", "objects_scrubbed": 0, "log_size": 0, "ondisk_log_size": 0, "stats_invalid": false, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 0, "scrub_schedule": "--", "scrub_duration": 0, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 2115497390, "num_objects": 506, "num_object_clones": 0, "num_object_copies": 0, "num_objects_missing_on_primary": 0, "num_objects_missing": 13042, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 506, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 42, "num_write_kb": 172032, "num_scrub_errors": 0, 
"num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [], "acting": [], "avail_no_missing": [], "object_location_counts": [], "blocked_by": [], "up_primary": -1, "acting_primary": -1, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 1, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", "history": [] } }, { "peer": "46(5)", "pgid": "14.ffs5", "last_update": "19550'35077", "last_complete": "17223'34381", "log_tail": "12957'31152", "last_user_version": 33714, "last_backfill": "MAX", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, "last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "16465'33714", "reported_seq": 388679, "reported_epoch": 16465, "state": "active+clean", "last_fresh": "2022-12-26T16:10:12.977587+0000", "last_change": "2022-12-26T02:28:03.455705+0000", "last_active": "2022-12-26T16:10:12.977587+0000", "last_peered": "2022-12-26T16:10:12.977587+0000", 
"last_clean": "2022-12-26T16:10:12.977587+0000", "last_became_active": "2022-12-26T02:28:03.455192+0000", "last_became_peered": "2022-12-26T02:28:03.455192+0000", "last_unstale": "2022-12-26T16:10:12.977587+0000", "last_undegraded": "2022-12-26T16:10:12.977587+0000", "last_fullsized": "2022-12-26T16:10:12.977587+0000", "mapping_epoch": 16615, "log_start": "12957'31152", "ondisk_log_start": "12957'31152", "created": 4537, "last_epoch_clean": 14655, "parent": "0.0", "parent_split_bits": 8, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "objects_scrubbed": 16227, "log_size": 2562, "ondisk_log_size": 2562, "stats_invalid": true, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 14, "scrub_schedule": "queued for deep scrub", "scrub_duration": 13.320415128, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 51491812456, "num_objects": 12301, "num_object_clones": 0, "num_object_copies": 73806, "num_objects_missing_on_primary": 0, "num_objects_missing": 0, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 12301, "num_whiteouts": 0, "num_read": 66867, "num_read_kb": 177079789, "num_write": 19946, "num_write_kb": 64929471, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 34017, "num_bytes_recovered": 142424959053, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, 
"num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [ 5, 36, 58, 42, 18, 24 ], "acting": [ 50, 36, 5, 26, 15, 46 ], "avail_no_missing": [], "object_location_counts": [], "blocked_by": [], "up_primary": 5, "acting_primary": 50, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", "history": [] } }, { "peer": "58(2)", "pgid": "14.ffs2", "last_update": "19550'35077", "last_complete": "17223'34381", "log_tail": "12957'31152", "last_user_version": 0, "last_backfill": "14:ff09a915:::10000001249.00000353:head", "purged_snaps": [], "history": { "epoch_created": 4537, "epoch_pool_created": 2032, "last_epoch_started": 16616, "last_interval_started": 16615, "last_epoch_clean": 14655, "last_interval_clean": 14654, "last_epoch_split": 4537, "last_epoch_marked_full": 0, "same_up_since": 16613, "same_interval_since": 16615, "same_primary_since": 16615, "last_scrub": "3817'25569", "last_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "last_deep_scrub": "3756'21592", "last_deep_scrub_stamp": "2022-12-14T19:35:51.893008+0000", "last_clean_scrub_stamp": "2022-12-17T06:33:40.577932+0000", "prior_readable_until_ub": 0 }, "stats": { "version": "0'0", "reported_seq": 0, "reported_epoch": 0, "state": "unknown", "last_fresh": "0.000000", "last_change": "0.000000", "last_active": "0.000000", "last_peered": "0.000000", "last_clean": "0.000000", "last_became_active": "0.000000", "last_became_peered": "0.000000", "last_unstale": "0.000000", "last_undegraded": "0.000000", "last_fullsized": "0.000000", "mapping_epoch": 0, "log_start": "0'0", "ondisk_log_start": "0'0", "created": 0, "last_epoch_clean": 0, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "0.000000", "last_deep_scrub": "0'0", 
"last_deep_scrub_stamp": "0.000000", "last_clean_scrub_stamp": "0.000000", "objects_scrubbed": 0, "log_size": 0, "ondisk_log_size": 0, "stats_invalid": false, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": false, "snaptrimq_len": 0, "last_scrub_duration": 0, "scrub_schedule": "--", "scrub_duration": 0, "objects_trimmed": 0, "snaptrim_duration": 0, "stat_sum": { "num_bytes": 2115497390, "num_objects": 506, "num_object_clones": 0, "num_object_copies": 0, "num_objects_missing_on_primary": 0, "num_objects_missing": 13042, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 506, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 42, "num_write_kb": 172032, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0, "num_omap_bytes": 0, "num_omap_keys": 0, "num_objects_repaired": 0 }, "up": [], "acting": [], "avail_no_missing": [], "object_location_counts": [], "blocked_by": [], "up_primary": -1, "acting_primary": -1, "purged_snaps": [] }, "empty": 0, "dne": 0, "incomplete": 1, "last_epoch_started": 16616, "hit_set_history": { "current_last_update": "0'0", "history": [] } } ], "recovery_state": [ { "name": "Started/Primary/Active", "enter_time": "2022-12-26T21:27:49.970477+0000", "might_have_unfound": [], "recovery_progress": { "backfill_targets": [ "5(0)", "18(4)", "24(5)", "42(3)", "58(2)" ], 
"waiting_on_backfill": [], "last_backfill_started": "14:ff09b920:::10000005377.00000209:head", "backfill_info": { "begin": "14:ff09bed4:::10000001a7f.0000182d:head", "end": "14:ff0a8fd7:::1000000072c.000001a0:head", "objects": [ { "object": "14:ff09bed4:::10000001a7f.0000182d:head", "version": "3801'24969" }, { "object": "14:ff09c414:::10000004daf.00001063:head", "version": "6812'28471" }, { "object": "14:ff09c530:::10000003957.00000189:head", "version": "9167'29571" }, { "object": "14:ff09c607:::100000011db.00000977:head", "version": "3703'16620" }, { "object": "14:ff09c922:::10000001230.000001e7:head", "version": "3728'18321" }, { "object": "14:ff09cb55:::1000000454e.000009a3:head", "version": "3706'17783" }, { "object": "14:ff09ce4d:::10000001316.00000154:head", "version": "3798'23751" }, { "object": "14:ff09d097:::10000003c4f.00000085:head", "version": "3822'26172" }, { "object": "14:ff09d449:::1000000188a.0000034a:head", "version": "4627'27150" }, { "object": "14:ff09d6b4:::10000003c4a.00000b0f:head", "version": "3822'26170" }, { "object": "14:ff09de4e:::10000001b63.000004c5:head", "version": "3608'11396" }, { "object": "14:ff09e1e3:::10000002a02.000016cb:head", "version": "12960'31316" }, { "object": "14:ff09f059:::10000002c09.00000095:head", "version": "13226'31609" }, { "object": "14:ff09f069:::1000000279e.0000012f:head", "version": "13274'31844" }, { "object": "14:ff0a0029:::10000001cd0.000000a2:head", "version": "3702'16325" }, { "object": "14:ff0a083b:::100000016c6.0000013e:head", "version": "3497'8495" }, { "object": "14:ff0a0d58:::1000000468d.00001cba:head", "version": "9168'29618" }, { "object": "14:ff0a0dfd:::10000000540.0000064a:head", "version": "3493'8171" }, { "object": "14:ff0a1149:::10000005049.00000b52:head", "version": "3529'10013" }, { "object": "14:ff0a1cdb:::100000009df.0000afec:head", "version": "2106'380" }, { "object": "14:ff0a1e1a:::1000000390b.00000106:head", "version": "6812'28456" }, { "object": 
"14:ff0a2180:::10000003868.000000b9:head", "version": "3706'17681" }, { "object": "14:ff0a2438:::1000000133e.00000024:head", "version": "3798'24057" }, { "object": "14:ff0a280e:::10000002ccc.000004af:head", "version": "15259'33150" }, { "object": "14:ff0a2f43:::10000003d40.00000665:head", "version": "10508'29957" }, { "object": "14:ff0a315f:::100000011db.00000caa:head", "version": "3703'16628" }, { "object": "14:ff0a3347:::10000004576.00001fb9:head", "version": "3732'19654" }, { "object": "14:ff0a3948:::100000032eb.0000011a:head", "version": "3612'11760" }, { "object": "14:ff0a3af1:::1000000275d.0000027a:head", "version": "12960'31292" }, { "object": "14:ff0a4073:::100000018fe.0000355d:head", "version": "3498'8949" }, { "object": "14:ff0a44db:::10000001a11.00000203:head", "version": "3755'21336" }, { "object": "14:ff0a4888:::1000000246e.0000360a:head", "version": "13794'32296" }, { "object": "14:ff0a4e86:::100000054b1.000019c1:head", "version": "12890'30650" }, { "object": "14:ff0a51c6:::10000003970.0000096e:head", "version": "10636'30139" }, { "object": "14:ff0a5302:::10000001871.000018eb:head", "version": "3817'25516" }, { "object": "14:ff0a56ba:::10000002848.00000266:head", "version": "13855'32461" }, { "object": "14:ff0a61f3:::10000000759.00000030:head", "version": "3703'17434" }, { "object": "14:ff0a68e3:::10000001c17.0000012c:head", "version": "3653'13629" }, { "object": "14:ff0a722a:::10000003e74.00000091:head", "version": "3778'22970" }, { "object": "14:ff0a7557:::100000044fe.00001664:head", "version": "3660'14671" }, { "object": "14:ff0a8f78:::10000001821.00000844:head", "version": "3797'23629" }, { "object": "14:ff0a8f8f:::10000001bea.00000202:head", "version": "3653'13526" } ] }, "peer_backfill_info": [ "5(0)", { "begin": "MAX", "end": "MAX", "objects": [] }, "18(4)", { "begin": "MAX", "end": "MAX", "objects": [] }, "24(5)", { "begin": "MAX", "end": "MAX", "objects": [] }, "42(3)", { "begin": "MAX", "end": "MAX", "objects": [] }, "58(2)", { "begin": 
"MAX", "end": "MAX", "objects": [] } ], "backfills_in_flight": [ "14:ff09b0d9:::100000033c2.00000146:head", "14:ff09b6e5:::10000004b37.00000161:head", "14:ff09b7b4:::1000000278f.00000374:head", "14:ff09b920:::10000005377.00000209:head" ], "recovering": [ "14:ff09b0d9:::100000033c2.00000146:head", "14:ff09b6e5:::10000004b37.00000161:head", "14:ff09b7b4:::1000000278f.00000374:head", "14:ff09b920:::10000005377.00000209:head" ], "pg_backend": { "recovery_ops": [ { "hoid": "14:ff09b0d9:::100000033c2.00000146:head", "v": "3677'15440", "missing_on": "5(0),18(4),24(5),42(3),58(2)", "missing_on_shards": "0,2,3,4,5", "recovery_info": "ObjectRecoveryInfo(14:ff09b0d9:::100000033c2.00000146:head@3677'15440, size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:{}, object_exist: 1)", "recovery_progress": "ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false)", "state": "WRITING", "waiting_on_pushes": "5(0)", "extent_requested": "0,8388608" }, { "hoid": "14:ff09b6e5:::10000004b37.00000161:head", "v": "3549'10993", "missing_on": "5(0),18(4),24(5),42(3),58(2)", "missing_on_shards": "0,2,3,4,5", "recovery_info": "ObjectRecoveryInfo(14:ff09b6e5:::10000004b37.00000161:head@3549'10993, size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:{}, object_exist: 1)", "recovery_progress": "ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false)", "state": "WRITING", "waiting_on_pushes": "5(0)", "extent_requested": "0,8388608" }, { "hoid": "14:ff09b7b4:::1000000278f.00000374:head", "v": "13272'31836", "missing_on": "5(0),18(4),24(5),42(3),58(2)", "missing_on_shards": "0,2,3,4,5", "recovery_info": "ObjectRecoveryInfo(14:ff09b7b4:::1000000278f.00000374:head@13272'31836, size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:{}, object_exist: 1)", "recovery_progress": "ObjectRecoveryProgress(!first, 
data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false)", "state": "WRITING", "waiting_on_pushes": "5(0)", "extent_requested": "0,8388608" }, { "hoid": "14:ff09b920:::10000005377.00000209:head", "v": "10440'29880", "missing_on": "5(0),18(4),24(5),42(3),58(2)", "missing_on_shards": "0,2,3,4,5", "recovery_info": "ObjectRecoveryInfo(14:ff09b920:::10000005377.00000209:head@10440'29880, size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:{}, object_exist: 1)", "recovery_progress": "ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false)", "state": "WRITING", "waiting_on_pushes": "5(0)", "extent_requested": "0,8388608" } ], "read_ops": [] } } }, { "name": "Started", "enter_time": "2022-12-26T21:27:48.908226+0000" } ], "scrubber": { "active": false, "must_scrub": true, "must_deep_scrub": true, "must_repair": false, "need_auto": false, "scrub_reg_stamp": "1.000000", "schedule": "queued for deep scrub" }, "agent_state": {} } On Wed, Dec 28, 2022 at 6:46 AM Pavin Joseph <me@xxxxxxxxxxxxxxx> wrote: > 1. This is a guess, but check /var/[lib|run]/ceph for any lock files. > 2. This is more straightforward to fix, add faster WAL/Block device/LV > for each OSD or create a fast storage pool just for metadata. Also, > experiment with MDS cache size/trim [0] settings. > > [0]: https://docs.ceph.com/en/latest/cephfs/cache-configuration/ > > On 28-Dec-22 7:23 AM, Deep Dish wrote: > > Got logging enabled as per > > https://ceph.io/en/news/blog/2022/centralized_logging/. My embedded > > grafana doesn't come up in the dashboard, but at least I have log (files) > > on my nodes. Interesting. > > > > Two issues plaguing my cluster: > > > > 1 - RGWs not manageable > > 2 - MDS_SLOW_METADATA_IO warning (impact to cephfs) > > > > Issue 1: > > > > I have 4x RGWs deployed. All started / processes running. 
They all > > report similar log entries: > > > > 7fcc32b6a5c0 0 deferred set uid:gid to 167:167 (ceph:ceph) > > > > 7fcc32b6a5c0 0 ceph version 17.2.5 > > (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable), process > > radosgw, pid 2 > > > > 7fcc32b6a5c0 0 framework: beast > > > > 7fcc32b6a5c0 0 framework conf key: port, val: 80 > > > > 7fcc32b6a5c0 1 radosgw_Main not setting numa affinity > > > > 7fcc32b6a5c0 1 rgw_d3n: rgw_d3n_l1_local_datacache_enabled=0 > > > > 7fcc32b6a5c0 1 D3N datacache enabled: 0 > > > > 7fcc0869a700 0 INFO: RGWReshardLock::lock found lock on > reshard.0000000011 > > to be held by another RGW process; skipping for now > > > > 7fcc0bea1700 0 lifecycle: RGWLC::process() failed to acquire lock on > lc.1, > > sleep 5, try again > > > > 7fcc0dea5700 0 lifecycle: RGWLC::process() failed to acquire lock on > lc.3, > > sleep 5, try again > > > > 7fcc0dea5700 0 lifecycle: RGWLC::process() failed to acquire lock on > > lc.16, sleep 5, try again > > > > 7fcc0dea5700 0 lifecycle: RGWLC::process() failed to acquire lock on > > lc.16, sleep 5, try again > > > > 7fcc0bea1700 0 lifecycle: RGWLC::process() failed to acquire lock on > > lc.16, sleep 5, try again > > > > 7fcc0dea5700 0 lifecycle: RGWLC::process() failed to acquire lock on > > lc.16, sleep 5, try again > > > > 7fcc0bea1700 0 lifecycle: RGWLC::process() failed to acquire lock on > > lc.16, sleep 5, try again > > > > 7fcc0dea5700 0 lifecycle: RGWLC::process() failed to acquire lock on > > lc.16, sleep 5, try again > > > > 7fcc0bea1700 0 lifecycle: RGWLC::process() failed to acquire lock on > > lc.16, sleep 5, try again > > > > 7fcc0dea5700 0 lifecycle: RGWLC::process() failed to acquire lock on > > lc.16, sleep 5, try again > > > > 7fcc0bea1700 0 lifecycle: RGWLC::process() failed to acquire lock on > > lc.16, sleep 5, try again > > (repeating) > > > > Seems like a stale lock, not previously cleaned up when the cluster was > > busy recovering and rebalancing. 
> > > > Issue 2: > > > > ceph health detail: > > > > [WRN] MDS_SLOW_METADATA_IO: 1 MDSs report slow metadata IOs > > > > mds.fs01.ceph02mon03.rjcxat(mds.0): 8 slow metadata IOs are blocked > > > > 30 secs, oldest blocked for 39485 secs > > > > Log entries from ceph02mon03 MDS host: > > > > 7fe36debb700 1 mds.fs01.ceph02mon03.rjcxat Updating MDS map to version > > 131271 from mon.4 > > 7fe36debb700 1 mds.fs01.ceph02mon03.rjcxat Updating MDS map to version > > 131272 from mon.4 > > 7fe36debb700 1 mds.fs01.ceph02mon03.rjcxat Updating MDS map to version > > 131273 from mon.4 > > 7fe36debb700 1 mds.fs01.ceph02mon03.rjcxat Updating MDS map to version > > 131274 from mon.4 > > 7fe36debb700 1 mds.fs01.ceph02mon03.rjcxat Updating MDS map to version > > 131275 from mon.4 > > 7fe36c6b8700 0 log_channel(cluster) log [WRN] : 1 slow requests, 1 > > included below; oldest blocked for > 33.126589 secs > > 7fe36c6b8700 0 log_channel(cluster) log [WRN] : slow request 33.126588 > > seconds old, received at 2022-12-27T19:45:45.952225+0000: > > client_request(client.55009:99980 create > > #0x10000000bc2/vzdump-qemu-30003-2022_12_27-14_43_43.log > > 2022-12-27T19:45:45.948045+0000 caller_uid=0, caller_gid=0{}) currently > > submit entry: journal_and_reply > > 7fe36debb700 1 mds.fs01.ceph02mon03.rjcxat Updating MDS map to version > > 131276 from mon.4 > > 7fe36c6b8700 0 log_channel(cluster) log [WRN] : 1 slow requests, 0 > > included below; oldest blocked for > 38.126737 secs > > 7fe36debb700 1 mds.fs01.ceph02mon03.rjcxat Updating MDS map to version > > 131277 from mon.4 > > 7fe36debb700 1 mds.fs01.ceph02mon03.rjcxat Updating MDS map to version > > 131278 from mon.4 > > 7fe36debb700 1 mds.fs01.ceph02mon03.rjcxat Updating MDS map to version > > 131279 from mon.4 > > 7fe36debb700 1 mds.fs01.ceph02mon03.rjcxat Updating MDS map to version > > 131280 from mon.4 > > > > > > I suspect that the file in the log above is the culprit. How can I > get > > to the root cause of MDS slowdowns?
> > > > > > On Tue, Dec 27, 2022 at 3:32 PM Pavin Joseph <me@xxxxxxxxxxxxxxx> wrote: > > > >> Interesting, the logs show the crash module [0] itself has crashed. > >> Something sent it a SIGINT or SIGTERM and the module didn't handle it > >> correctly due to what seems like a bug in the code. > >> > >> I haven't experienced the crash module itself crashing yet (in Quincy) > >> because nothing sent a SIG[INT|TERM] to it yet. > >> > >> So I'd continue investigating into why these signals were sent to the > >> crash module. > >> > >> To fix the crash module from crashing, go to "/usr/bin/ceph-crash" and > >> edit the handler function on line 82 like so: > >> > >> def handler(signum, frame): > >> print('*** Interrupted with signal %d ***' % signum) > >> signame = signal.Signals(signum).name > >> print(f'Signal handler called with signal {signame} ({signum})') > >> print(frame) > >> sys.exit(0) > >> > >> --- > >> > >> Once the crash module is working, perhaps you could run a "ceph crash > ls" > >> > >> Regarding podman logs, perhaps try this [1]. > >> > >> [0]: https://docs.ceph.com/en/quincy/mgr/crash/ > >> [1]: https://docs.podman.io/en/latest/markdown/podman-logs.1.html > >> > >> On 27-Dec-22 11:59 PM, Deep Dish wrote: > >>> HI Pavin, > >>> > >>> Thanks for the reply. I'm a bit at a loss honestly as this worked > >>> perfectly without any issue up until the rebalance of the cluster. > >>> Orchestrator is great. Aside from this (which I suspect is not > >>> orchestrator related), I haven't had any issues. > >>> > >>> In terms of logs, I'm not sure where to start looking in this new > >>> containerized environment as they pertain to individual ceph processes > >> -- I > >>> assumed everything would be centrally collected within orch. > >>> > >>> Connecting into the podman container of a RGW, there are no logs in > >>> /var/log/ceph aside from ceph-volume. My ceph.conf is minimal with > only > >>> monitors defined. 
The only log I'm able to pull is as follows: > >>> > >>> # podman logs 35d4ac5445ca > >>> > >>> INFO:ceph-crash:monitoring path /var/lib/ceph/crash, delay 600s > >>> > >>> Traceback (most recent call last): > >>> > >>> File "/usr/bin/ceph-crash", line 113, in <module> > >>> > >>> main() > >>> > >>> File "/usr/bin/ceph-crash", line 109, in main > >>> > >>> time.sleep(args.delay * 60) > >>> > >>> TypeError: handler() takes 1 positional argument but 2 were given > >>> > >>> INFO:ceph-crash:monitoring path /var/lib/ceph/crash, delay 600s > >>> > >>> > >>> > >>> Looks like the RGW daemon is crashing. How do I get logs to persist? > >> I > >>> suspect I won't be able to use orchestrator to push down the config, > and > >>> would have to manipulate within the container image itself. > >>> > >>> I also attempted to redeploy the RGW containers without success. > >>> > >>> On Tue, Dec 27, 2022 at 10:39 AM Pavin Joseph <me@xxxxxxxxxxxxxxx> > >> wrote: > >>> > >>>> Here's the first things I'd check in your situation: > >>>> > >>>> 1. Logs > >>>> 2. Is the RGW HTTP server running on its port? > >>>> 3. Re-check config including authentication. > >>>> > >>>> ceph orch is too new and didn't pass muster in our own internal > testing. > >>>> You're braver than most for using it in production. > >>>> > >>>> Pavin. > >>>> > >>>> On 27-Dec-22 8:48 PM, Deep Dish wrote: > >>>>> Quick update: > >>>>> > >>>>> - I followed documentation, and ran the following: > >>>>> > >>>>> # ceph dashboard set-rgw-credentials > >>>>> > >>>>> Error EINVAL: No RGW credentials found, please consult the > >> documentation > >>>> on > >>>>> how to enable RGW for the dashboard. > >>>>> > >>>>> > >>>>> > >>>>> - I see dashboard credentials configured (all this was working fine > >>>> before): > >>>>> > >>>>> > >>>>> # ceph dashboard get-rgw-api-access-key > >>>>> > >>>>> P?????????????????G (?
commented out) > >>>>> > >>>>> > >>>>> > >>>>> Seems to me like my RGW config is non-existent / corrupted for some > >>>>> reason. When trying to curl a RGW directly I get a "connection > >> refused". > >>>>> > >>>>> > >>>>> > >>>>> On Tue, Dec 27, 2022 at 9:41 AM Deep Dish <deeepdish@xxxxxxxxx> > wrote: > >>>>> > >>>>>> I built a net-new Quincy cluster (17.2.5) using ceph orch as > follows: > >>>>>> > >>>>>> 2x mgrs > >>>>>> 4x rgw > >>>>>> 5x mon > >>>>>> 4x rgw > >>>>>> 5x mds > >>>>>> 6x osd hosts w/ 10 drives each --> will be growing to 7 osd hosts in > >> the > >>>>>> coming days. > >>>>>> > >>>>>> I migrated all data from my legacy nautilus cluster (via rbd-mirror, > >>>>>> rclone for s3 buckets, etc.). All moved over successfully without > >>>> issue. > >>>>>> > >>>>>> The cluster went through a series of rebalancing events (adding > >>>> capacity, > >>>>>> osd nodes, changing fault domain for EC volumes). > >>>>>> > >>>>>> It's settled now, however throughout the process all of my RGW nodes > >> are > >>>>>> no longer part of the cluster -- meaning ceph doesn't recognize / > >> detect > >>>>>> them, despite containers, networking, etc. all being setup > correctly. > >>>>>> This also means I'm unable to manage any RGW functions (via the > >>>> dashboard > >>>>>> or cli). As an example via cli (within Cephadm shell): > >>>>>> > >>>>>> # radosgw-admin pools list > >>>>>> > >>>>>> could not list placement set: (2) No such file or directory > >>>>>> > >>>>>> I have data in buckets, how can I get my RGWs to return online? 
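Regarding the ceph-crash traceback quoted above ("handler() takes 1 positional argument but 2 were given"): Python always calls a signal handler with two arguments, the signal number and the current stack frame, so a handler defined with a single parameter fails as soon as a signal arrives. The following is only a minimal standalone sketch of the two-argument form suggested earlier in the thread. It is not the actual contents of /usr/bin/ceph-crash, and install_handlers is a hypothetical helper name used here for illustration.

```python
import signal
import sys


def handler(signum, frame):
    # Python passes (signal number, current stack frame) to every handler,
    # so both parameters must be accepted even if frame is unused.
    signame = signal.Signals(signum).name
    print(f"*** Interrupted with signal {signame} ({signum}) ***")
    sys.exit(0)


def install_handlers():
    # Hypothetical helper: register the same handler for the two signals
    # mentioned in the thread. Must be called from the main thread.
    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGTERM, handler)
```

This only addresses the TypeError. Why the process receives SIGINT or SIGTERM in the first place would still need to be investigated separately.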
_______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx
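For the stuck PG 14.ff at the top of this message, the full query output is hard to read by eye. A small script can reduce it to one row per shard. This is a rough sketch, assuming the output was saved with "ceph pg 14.ff query > pg.json", and assuming the per-peer entries sit under a top-level "peer_info" key (that key name is not visible in the pasted excerpt, so treat it as an assumption). The function name summarize_pg_query is made up for illustration.

```python
import json
from typing import Any, Dict, List


def summarize_pg_query(query: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one summary row per shard: the primary's info first, then each peer."""
    shards = [("primary", query.get("info", {}))]
    # Assumption: peers are listed under "peer_info", each with a "peer" label.
    for peer in query.get("peer_info", []):
        shards.append((peer.get("peer", "?"), peer))

    rows = []
    for name, info in shards:
        stats = info.get("stats", {})
        rows.append({
            "shard": name,
            "pgid": info.get("pgid"),
            "state": stats.get("state"),
            "last_backfill": info.get("last_backfill"),
            "incomplete": info.get("incomplete"),
            "missing": stats.get("stat_sum", {}).get("num_objects_missing"),
        })
    return rows


def load_and_print(path: str) -> None:
    # Read the saved query output and print one line per shard.
    with open(path) as fh:
        for row in summarize_pg_query(json.load(fh)):
            print(row)
```

Calling load_and_print("pg.json") prints one dictionary per shard, which makes it easier to see which backfill targets still report "unknown" and where last_backfill has stopped advancing.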