Re: PG scrub stamps reset to 0.000000 in 14.2.1

Hi!

I'm also affected by this:

HEALTH_WARN 13 pgs not deep-scrubbed in time; 13 pgs not scrubbed in time
PG_NOT_DEEP_SCRUBBED 13 pgs not deep-scrubbed in time
    pg 6.b1 not deep-scrubbed since 0.000000
    pg 7.ac not deep-scrubbed since 0.000000
    pg 7.a0 not deep-scrubbed since 0.000000
    pg 6.96 not deep-scrubbed since 0.000000
    pg 7.92 not deep-scrubbed since 0.000000
    pg 6.86 not deep-scrubbed since 0.000000
    pg 7.74 not deep-scrubbed since 0.000000
    pg 7.75 not deep-scrubbed since 0.000000
    pg 7.49 not deep-scrubbed since 0.000000
    pg 7.47 not deep-scrubbed since 0.000000
    pg 6.2a not deep-scrubbed since 0.000000
    pg 6.26 not deep-scrubbed since 0.000000
    pg 6.b not deep-scrubbed since 0.000000
PG_NOT_SCRUBBED 13 pgs not scrubbed in time
    pg 6.b1 not scrubbed since 0.000000
    pg 7.ac not scrubbed since 0.000000
    pg 7.a0 not scrubbed since 0.000000
    pg 6.96 not scrubbed since 0.000000
    pg 7.92 not scrubbed since 0.000000
    pg 6.86 not scrubbed since 0.000000
    pg 7.74 not scrubbed since 0.000000
    pg 7.75 not scrubbed since 0.000000
    pg 7.49 not scrubbed since 0.000000
    pg 7.47 not scrubbed since 0.000000
    pg 6.2a not scrubbed since 0.000000
    pg 6.26 not scrubbed since 0.000000
    pg 6.b not scrubbed since 0.000000


A week ago this status was:


HEALTH_WARN 6 pgs not deep-scrubbed in time; 6 pgs not scrubbed in time
PG_NOT_DEEP_SCRUBBED 6 pgs not deep-scrubbed in time
    pg 7.b1 not deep-scrubbed since 0.000000
    pg 7.7e not deep-scrubbed since 0.000000
    pg 6.6e not deep-scrubbed since 0.000000
    pg 7.8 not deep-scrubbed since 0.000000
    pg 7.40 not deep-scrubbed since 0.000000
    pg 6.f5 not deep-scrubbed since 0.000000
PG_NOT_SCRUBBED 6 pgs not scrubbed in time
    pg 7.b1 not scrubbed since 0.000000
    pg 7.7e not scrubbed since 0.000000
    pg 6.6e not scrubbed since 0.000000
    pg 7.8 not scrubbed since 0.000000
    pg 7.40 not scrubbed since 0.000000
    pg 6.f5 not scrubbed since 0.000000
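
The two listings above can be compared mechanically. What follows is a minimal sketch, assuming the text is saved `ceph health detail` output in the format pasted above; the function name `zero_stamp_pgs` is hypothetical and not part of any Ceph tooling.

```python
import re

# Matches detail lines such as "    pg 6.b1 not deep-scrubbed since 0.000000".
# Header lines (HEALTH_WARN ..., PG_NOT_SCRUBBED ...) do not match.
PG_LINE = re.compile(
    r"^\s*pg\s+(\S+)\s+not\s+(deep-scrubbed|scrubbed)\s+since\s+(\S+)"
)


def zero_stamp_pgs(health_text):
    """Return PG ids whose reported stamp is 0.000000, grouped by scrub type."""
    result = {"scrubbed": set(), "deep-scrubbed": set()}
    for line in health_text.splitlines():
        m = PG_LINE.match(line)
        # A real timestamp starts with a date, so only the literal zero stamp counts.
        if m and m.group(3) == "0.000000":
            result[m.group(2)].add(m.group(1))
    return result


if __name__ == "__main__":
    # Shortened samples taken from the two listings above.
    now = "    pg 6.b1 not scrubbed since 0.000000\n    pg 7.ac not scrubbed since 0.000000\n"
    week_ago = "    pg 7.b1 not scrubbed since 0.000000\n    pg 7.7e not scrubbed since 0.000000\n"
    both = zero_stamp_pgs(now)["scrubbed"] & zero_stamp_pgs(week_ago)["scrubbed"]
    print(sorted(both))
```

Run over the full listings, the intersection is empty: none of the six PGs reported a week ago is among the thirteen reported now.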


Is this a known problem already? I can't find a bug report.


Cheers

-- Jonas



On 16/05/2019 01.13, Brett Chancellor wrote:
> After upgrading from 14.2.0 to 14.2.1, I've noticed that PGs frequently reset their scrub and deep-scrub timestamps
> to 0.000000. This is especially strange because the peers still show timestamps for deep scrubs.
> 
> ## First entry from a pg list at 7pm
> $ grep 11.2f2 ~/pgs-active.7pm 
> 11.2f2     691        0         0       0 2897477632           0          0 2091 active+clean    3h  7378'12291  8048:36261    [1,6,37]p1    [1,6,37]p1 2019-05-14 21:01:29.172460 2019-05-14 21:01:29.172460
> 
> ## Next Entry 3 minutes later
> $ ceph pg ls active |grep 11.2f2
> 11.2f2     695        0         0       0 2914713600           0          0 2091 active+clean    6s  7378'12291  8049:36330    [1,6,37]p1    [1,6,37]p1                   0.000000                   0.000000
> 
> ## PG Query
> {
>     "state": "active+clean",
>     "snap_trimq": "[]",
>     "snap_trimq_len": 0,
>     "epoch": 8049,
>     "up": [
>         1,
>         6,
>         37
>     ],
>     "acting": [
>         1,
>         6,
>         37
>     ],
>     "acting_recovery_backfill": [
>         "1",
>         "6",
>         "37"
>     ],
>     "info": {
>         "pgid": "11.2f2",
>         "last_update": "7378'12291",
>         "last_complete": "7378'12291",
>         "log_tail": "1087'10200",
>         "last_user_version": 12291,
>         "last_backfill": "MAX",
>         "last_backfill_bitwise": 1,
>         "purged_snaps": [],
>         "history": {
>             "epoch_created": 1549,
>             "epoch_pool_created": 216,
>             "last_epoch_started": 6148,
>             "last_interval_started": 6147,
>             "last_epoch_clean": 6148,
>             "last_interval_clean": 6147,
>             "last_epoch_split": 6147,
>             "last_epoch_marked_full": 0,
>             "same_up_since": 6126,
>             "same_interval_since": 6147,
>             "same_primary_since": 6126,
>             "last_scrub": "7378'12291",
>             "last_scrub_stamp": "0.000000",
>             "last_deep_scrub": "6103'12186",
>             "last_deep_scrub_stamp": "0.000000",
>             "last_clean_scrub_stamp": "2019-05-15 23:08:17.014575"
>         },
>         "stats": {
>             "version": "7378'12291",
>             "reported_seq": "36700",
>             "reported_epoch": "8049",
>             "state": "active+clean",
>             "last_fresh": "2019-05-15 23:08:17.014609",
>             "last_change": "2019-05-15 23:08:17.014609",
>             "last_active": "2019-05-15 23:08:17.014609",
>             "last_peered": "2019-05-15 23:08:17.014609",
>             "last_clean": "2019-05-15 23:08:17.014609",
>             "last_became_active": "2019-05-15 19:25:01.484322",
>             "last_became_peered": "2019-05-15 19:25:01.484322",
>             "last_unstale": "2019-05-15 23:08:17.014609",
>             "last_undegraded": "2019-05-15 23:08:17.014609",
>             "last_fullsized": "2019-05-15 23:08:17.014609",
>             "mapping_epoch": 6126,
>             "log_start": "1087'10200",
>             "ondisk_log_start": "1087'10200",
>             "created": 1549,
>             "last_epoch_clean": 6148,
>             "parent": "0.0",
>             "parent_split_bits": 10,
>             "last_scrub": "7378'12291",
>             "last_scrub_stamp": "0.000000",
>             "last_deep_scrub": "6103'12186",
>             "last_deep_scrub_stamp": "0.000000",
>             "last_clean_scrub_stamp": "2019-05-15 23:08:17.014575",
>             "log_size": 2091,
>             "ondisk_log_size": 2091,
>             "stats_invalid": false,
>             "dirty_stats_invalid": false,
>             "omap_stats_invalid": false,
>             "hitset_stats_invalid": false,
>             "hitset_bytes_stats_invalid": false,
>             "pin_stats_invalid": false,
>             "manifest_stats_invalid": true,
>             "snaptrimq_len": 0,
>             "stat_sum": {
>                 "num_bytes": 2914713600,
>                 "num_objects": 695,
>                 "num_object_clones": 0,
>                 "num_object_copies": 2085,
>                 "num_objects_missing_on_primary": 0,
>                 "num_objects_missing": 0,
>                 "num_objects_degraded": 0,
>                 "num_objects_misplaced": 0,
>                 "num_objects_unfound": 0,
>                 "num_objects_dirty": 695,
>                 "num_whiteouts": 0,
>                 "num_read": 0,
>                 "num_read_kb": 0,
>                 "num_write": 0,
>                 "num_write_kb": 0,
>                 "num_scrub_errors": 0,
>                 "num_shallow_scrub_errors": 0,
>                 "num_deep_scrub_errors": 0,
>                 "num_objects_recovered": 0,
>                 "num_bytes_recovered": 0,
>                 "num_keys_recovered": 0,
>                 "num_objects_omap": 0,
>                 "num_objects_hit_set_archive": 0,
>                 "num_bytes_hit_set_archive": 0,
>                 "num_flush": 0,
>                 "num_flush_kb": 0,
>                 "num_evict": 0,
>                 "num_evict_kb": 0,
>                 "num_promote": 0,
>                 "num_flush_mode_high": 0,
>                 "num_flush_mode_low": 0,
>                 "num_evict_mode_some": 0,
>                 "num_evict_mode_full": 0,
>                 "num_objects_pinned": 0,
>                 "num_legacy_snapsets": 0,
>                 "num_large_omap_objects": 0,
>                 "num_objects_manifest": 0,
>                 "num_omap_bytes": 0,
>                 "num_omap_keys": 0,
>                 "num_objects_repaired": 0
>             },
>             "up": [
>                 1,
>                 6,
>                 37
>             ],
>             "acting": [
>                 1,
>                 6,
>                 37
>             ],
>             "blocked_by": [],
>             "up_primary": 1,
>             "acting_primary": 1,
>             "purged_snaps": []
>         },
>         "empty": 0,
>         "dne": 0,
>         "incomplete": 0,
>         "last_epoch_started": 6148,
>         "hit_set_history": {
>             "current_last_update": "0'0",
>             "history": []
>         }
>     },
>     "peer_info": [
>         {
>             "peer": "6",
>             "pgid": "11.2f2",
>             "last_update": "7378'12291",
>             "last_complete": "7378'12291",
>             "log_tail": "1087'10200",
>             "last_user_version": 12264,
>             "last_backfill": "MAX",
>             "last_backfill_bitwise": 1,
>             "purged_snaps": [],
>             "history": {
>                 "epoch_created": 1549,
>                 "epoch_pool_created": 216,
>                 "last_epoch_started": 6148,
>                 "last_interval_started": 6147,
>                 "last_epoch_clean": 6148,
>                 "last_interval_clean": 6147,
>                 "last_epoch_split": 6147,
>                 "last_epoch_marked_full": 0,
>                 "same_up_since": 6126,
>                 "same_interval_since": 6147,
>                 "same_primary_since": 6126,
>                 "last_scrub": "7378'12291",
>                 "last_scrub_stamp": "2019-05-14 21:01:29.172460",
>                 "last_deep_scrub": "6103'12186",
>                 "last_deep_scrub_stamp": "2019-05-14 21:01:29.172460",
>                 "last_clean_scrub_stamp": "2019-05-15 23:08:17.014575"
>             },
>             "stats": {
>                 "version": "6103'12264",
>                 "reported_seq": "34823",
>                 "reported_epoch": "6125",
>                 "state": "active+undersized+degraded",
>                 "last_fresh": "2019-05-15 19:16:05.309556",
>                 "last_change": "2019-05-15 19:15:59.342872",
>                 "last_active": "2019-05-15 19:16:05.309556",
>                 "last_peered": "2019-05-15 19:16:05.309556",
>                 "last_clean": "2019-05-15 17:23:41.956141",
>                 "last_became_active": "2019-05-15 19:15:59.342872",
>                 "last_became_peered": "2019-05-15 19:15:59.342872",
>                 "last_unstale": "2019-05-15 19:16:05.309556",
>                 "last_undegraded": "2019-05-15 19:15:59.341684",
>                 "last_fullsized": "2019-05-15 19:15:59.340270",
>                 "mapping_epoch": 6126,
>                 "log_start": "1087'10200",
>                 "ondisk_log_start": "1087'10200",
>                 "created": 1549,
>                 "last_epoch_clean": 6115,
>                 "parent": "0.0",
>                 "parent_split_bits": 10,
>                 "last_scrub": "6103'12186",
>                 "last_scrub_stamp": "2019-05-14 21:01:29.172460",
>                 "last_deep_scrub": "6103'12186",
>                 "last_deep_scrub_stamp": "2019-05-14 21:01:29.172460",
>                 "last_clean_scrub_stamp": "2019-05-14 21:01:29.172460",
>                 "log_size": 2064,
>                 "ondisk_log_size": 2064,
>                 "stats_invalid": true,
>                 "dirty_stats_invalid": false,
>                 "omap_stats_invalid": false,
>                 "hitset_stats_invalid": false,
>                 "hitset_bytes_stats_invalid": false,
>                 "pin_stats_invalid": false,
>                 "manifest_stats_invalid": true,
>                 "snaptrimq_len": 0,
>                 "stat_sum": {
>                     "num_bytes": 2784231424,
>                     "num_objects": 664,
>                     "num_object_clones": 0,
>                     "num_object_copies": 1992,
>                     "num_objects_missing_on_primary": 0,
>                     "num_objects_missing": 0,
>                     "num_objects_degraded": 664,
>                     "num_objects_misplaced": 0,
>                     "num_objects_unfound": 0,
>                     "num_objects_dirty": 664,
>                     "num_whiteouts": 0,
>                     "num_read": 93,
>                     "num_read_kb": 350209,
>                     "num_write": 153,
>                     "num_write_kb": 604160,
>                     "num_scrub_errors": 0,
>                     "num_shallow_scrub_errors": 0,
>                     "num_deep_scrub_errors": 0,
>                     "num_objects_recovered": 129,
>                     "num_bytes_recovered": 540901376,
>                     "num_keys_recovered": 0,
>                     "num_objects_omap": 0,
>                     "num_objects_hit_set_archive": 0,
>                     "num_bytes_hit_set_archive": 0,
>                     "num_flush": 0,
>                     "num_flush_kb": 0,
>                     "num_evict": 0,
>                     "num_evict_kb": 0,
>                     "num_promote": 0,
>                     "num_flush_mode_high": 0,
>                     "num_flush_mode_low": 0,
>                     "num_evict_mode_some": 0,
>                     "num_evict_mode_full": 0,
>                     "num_objects_pinned": 0,
>                     "num_legacy_snapsets": 0,
>                     "num_large_omap_objects": 0,
>                     "num_objects_manifest": 0,
>                     "num_omap_bytes": 0,
>                     "num_omap_keys": 0,
>                     "num_objects_repaired": 0
>                 },
>                 "up": [
>                     1,
>                     6,
>                     37
>                 ],
>                 "acting": [
>                     1,
>                     6,
>                     37
>                 ],
>                 "blocked_by": [],
>                 "up_primary": 1,
>                 "acting_primary": 1,
>                 "purged_snaps": []
>             },
>             "empty": 0,
>             "dne": 0,
>             "incomplete": 0,
>             "last_epoch_started": 6148,
>             "hit_set_history": {
>                 "current_last_update": "0'0",
>                 "history": []
>             }
>         },
>         {
>             "peer": "37",
>             "pgid": "11.2f2",
>             "last_update": "7378'12291",
>             "last_complete": "7378'12291",
>             "log_tail": "1087'10200",
>             "last_user_version": 12264,
>             "last_backfill": "MAX",
>             "last_backfill_bitwise": 1,
>             "purged_snaps": [],
>             "history": {
>                 "epoch_created": 1549,
>                 "epoch_pool_created": 216,
>                 "last_epoch_started": 6148,
>                 "last_interval_started": 6147,
>                 "last_epoch_clean": 6148,
>                 "last_interval_clean": 6147,
>                 "last_epoch_split": 6147,
>                 "last_epoch_marked_full": 0,
>                 "same_up_since": 6126,
>                 "same_interval_since": 6147,
>                 "same_primary_since": 6126,
>                 "last_scrub": "7378'12291",
>                 "last_scrub_stamp": "2019-05-14 21:01:29.172460",
>                 "last_deep_scrub": "6103'12186",
>                 "last_deep_scrub_stamp": "2019-05-14 21:01:29.172460",
>                 "last_clean_scrub_stamp": "2019-05-15 23:08:17.014575"
>             },
>             "stats": {
>                 "version": "6103'12263",
>                 "reported_seq": "34275",
>                 "reported_epoch": "6103",
>                 "state": "active+clean",
>                 "last_fresh": "2019-05-15 17:23:41.956141",
>                 "last_change": "2019-05-14 21:01:29.172506",
>                 "last_active": "2019-05-15 17:23:41.956141",
>                 "last_peered": "2019-05-15 17:23:41.956141",
>                 "last_clean": "2019-05-15 17:23:41.956141",
>                 "last_became_active": "2019-05-14 20:16:44.641437",
>                 "last_became_peered": "2019-05-14 20:16:44.641437",
>                 "last_unstale": "2019-05-15 17:23:41.956141",
>                 "last_undegraded": "2019-05-15 17:23:41.956141",
>                 "last_fullsized": "2019-05-15 17:23:41.956141",
>                 "mapping_epoch": 6126,
>                 "log_start": "1087'10200",
>                 "ondisk_log_start": "1087'10200",
>                 "created": 1549,
>                 "last_epoch_clean": 6010,
>                 "parent": "0.0",
>                 "parent_split_bits": 10,
>                 "last_scrub": "6103'12186",
>                 "last_scrub_stamp": "2019-05-14 21:01:29.172460",
>                 "last_deep_scrub": "6103'12186",
>                 "last_deep_scrub_stamp": "2019-05-14 21:01:29.172460",
>                 "last_clean_scrub_stamp": "2019-05-14 21:01:29.172460",
>                 "log_size": 2063,
>                 "ondisk_log_size": 2063,
>                 "stats_invalid": true,
>                 "dirty_stats_invalid": false,
>                 "omap_stats_invalid": false,
>                 "hitset_stats_invalid": false,
>                 "hitset_bytes_stats_invalid": false,
>                 "pin_stats_invalid": false,
>                 "manifest_stats_invalid": true,
>                 "snaptrimq_len": 0,
>                 "stat_sum": {
>                     "num_bytes": 2784231424,
>                     "num_objects": 664,
>                     "num_object_clones": 0,
>                     "num_object_copies": 1993,
>                     "num_objects_missing_on_primary": 0,
>                     "num_objects_missing": 0,
>                     "num_objects_degraded": 0,
>                     "num_objects_misplaced": 0,
>                     "num_objects_unfound": 0,
>                     "num_objects_dirty": 664,
>                     "num_whiteouts": 0,
>                     "num_read": 93,
>                     "num_read_kb": 350209,
>                     "num_write": 153,
>                     "num_write_kb": 604160,
>                     "num_scrub_errors": 0,
>                     "num_shallow_scrub_errors": 0,
>                     "num_deep_scrub_errors": 0,
>                     "num_objects_recovered": 0,
>                     "num_bytes_recovered": 0,
>                     "num_keys_recovered": 0,
>                     "num_objects_omap": 0,
>                     "num_objects_hit_set_archive": 0,
>                     "num_bytes_hit_set_archive": 0,
>                     "num_flush": 0,
>                     "num_flush_kb": 0,
>                     "num_evict": 0,
>                     "num_evict_kb": 0,
>                     "num_promote": 0,
>                     "num_flush_mode_high": 0,
>                     "num_flush_mode_low": 0,
>                     "num_evict_mode_some": 0,
>                     "num_evict_mode_full": 0,
>                     "num_objects_pinned": 0,
>                     "num_legacy_snapsets": 0,
>                     "num_large_omap_objects": 0,
>                     "num_objects_manifest": 0,
>                     "num_omap_bytes": 0,
>                     "num_omap_keys": 0,
>                     "num_objects_repaired": 0
>                 },
>                 "up": [
>                     1,
>                     6,
>                     37
>                 ],
>                 "acting": [
>                     1,
>                     6,
>                     37
>                 ],
>                 "blocked_by": [],
>                 "up_primary": 1,
>                 "acting_primary": 1,
>                 "purged_snaps": []
>             },
>             "empty": 0,
>             "dne": 0,
>             "incomplete": 0,
>             "last_epoch_started": 6148,
>             "hit_set_history": {
>                 "current_last_update": "0'0",
>                 "history": []
>             }
>         }
>     ],
>     "recovery_state": [
>         {
>             "name": "Started/Primary/Active",
>             "enter_time": "2019-05-15 19:25:01.479877",
>             "might_have_unfound": [],
>             "recovery_progress": {
>                 "backfill_targets": [],
>                 "waiting_on_backfill": [],
>                 "last_backfill_started": "MIN",
>                 "backfill_info": {
>                     "begin": "MIN",
>                     "end": "MIN",
>                     "objects": []
>                 },
>                 "peer_backfill_info": [],
>                 "backfills_in_flight": [],
>                 "recovering": [],
>                 "pg_backend": {
>                     "pull_from_peer": [],
>                     "pushing": []
>                 }
>             },
>             "scrub": {
>                 "scrubber.epoch_start": "6147",
>                 "scrubber.active": false,
>                 "scrubber.state": "INACTIVE",
>                 "scrubber.start": "MIN",
>                 "scrubber.end": "MIN",
>                 "scrubber.max_end": "MIN",
>                 "scrubber.subset_last_update": "0'0",
>                 "scrubber.deep": false,
>                 "scrubber.waiting_on_whom": []
>             }
>         },
>         {
>             "name": "Started",
>             "enter_time": "2019-05-15 19:25:00.557042"
>         }
>     ],
>     "agent_state": {}
> }
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
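
The mismatch in the quoted query (the primary's history shows 0.000000 while both peers keep 2019-05-14 21:01:29.172460) can be checked from saved output. This is a minimal sketch, assuming the JSON was produced by `ceph pg <pgid> query` and that the top-level `info` block belongs to the primary; the function name `reset_stamps` is hypothetical, and only fields visible in the output above are read.

```python
import json

ZERO = "0.000000"
STAMP_KEYS = ("last_scrub_stamp", "last_deep_scrub_stamp")


def reset_stamps(query):
    """For each stamp that is zero in info.history, map peer id -> that peer's value."""
    primary = query["info"]["history"]
    findings = {}
    for key in STAMP_KEYS:
        if primary.get(key) == ZERO:
            findings[key] = {
                peer["peer"]: peer["history"].get(key)
                for peer in query.get("peer_info", [])
            }
    return findings


if __name__ == "__main__":
    # Trimmed to the fields used; values copied from the query above.
    sample = json.loads("""
    {
      "info": {"history": {"last_scrub_stamp": "0.000000",
                           "last_deep_scrub_stamp": "0.000000"}},
      "peer_info": [
        {"peer": "6",  "history": {"last_scrub_stamp": "2019-05-14 21:01:29.172460",
                                   "last_deep_scrub_stamp": "2019-05-14 21:01:29.172460"}},
        {"peer": "37", "history": {"last_scrub_stamp": "2019-05-14 21:01:29.172460",
                                   "last_deep_scrub_stamp": "2019-05-14 21:01:29.172460"}}
      ]
    }
    """)
    for key, peers in reset_stamps(sample).items():
        print(key, peers)
```

An empty result means neither stamp is zero on the primary. A non-empty result lists what each peer still remembers for the stamp that was reset.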




