Re: PGs stuck undersized and not scrubbed

PGs that are degraded won't scrub. Furthermore, an OSD that is involved in
the recovery of another PG won't accept scrubs either, so that is the likely
explanation for your not-scrubbed-in-time warnings. It's of low concern.

Are you sure that recovery is not progressing? I see "7349/147534197
objects degraded". Can you check that again (maybe wait an hour) and see
whether 7,349 has gone down?
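
If it helps to make that check repeatable, here is a minimal Python sketch
that compares two saved copies of the `ceph -s` output. The function names
are hypothetical, and it assumes the degraded count appears in the
"N/M objects degraded" form shown in your status output.

```python
import re

# Matches the "N/M objects degraded" fragment as printed by `ceph -s`.
DEGRADED_RE = re.compile(r"(\d+)/(\d+)\s+objects\s+degraded")


def degraded_objects(status_text):
    """Return (degraded, total) parsed from status text, or None if absent."""
    match = DEGRADED_RE.search(status_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def recovery_progressed(before_text, after_text):
    """True if the degraded object count dropped between two snapshots."""
    before = degraded_objects(before_text)
    after = degraded_objects(after_text)
    if before is None:
        raise ValueError("first snapshot reports no degraded objects")
    if after is None:
        # Nothing is reported as degraded any more.
        return True
    return after[0] < before[0]
```

Save the output once, save it again an hour later, and pass both strings to
recovery_progressed(). A False result means the count has not gone down.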

Another thing I'm noticing is that OSDs 57 and 79 are the primary for many
of the degraded PGs. They could probably use a service restart.
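
The primary tally can be reproduced from the `ceph health detail` text with
a short Python sketch. It assumes the acting primary is the first entry in
each "last acting [...]" list that is not NONE. The function name is
hypothetical, and the pattern also tolerates the "> " quote prefix that mail
clients insert when they wrap the line.

```python
import re
from collections import Counter

# "last acting [a,b,NONE,...]"; also accepts a line break and "> " quote
# marker between "last acting" and the opening bracket.
ACTING_RE = re.compile(r"last acting[\s>]*\[([^\]]*)\]")


def primary_counts(health_detail_text):
    """Count how often each OSD is the first non-NONE acting entry."""
    counts = Counter()
    for match in ACTING_RE.finditer(health_detail_text):
        osds = [entry.strip() for entry in match.group(1).split(",")]
        # Assumption: the first OSD that is not NONE acts as primary.
        primary = next((osd for osd in osds if osd != "NONE"), None)
        if primary is not None:
            counts[int(primary)] += 1
    return counts
```

Counting the 22 acting sets quoted below by this rule gives 12 PGs with
OSD 79 as primary and 10 with OSD 57.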

Respectfully,

*Wes Dillingham*
wes@xxxxxxxxxxxxxxxxx
LinkedIn <http://www.linkedin.com/in/wesleydillingham>


On Mon, Jun 5, 2023 at 12:01 PM Nicola Mori <mori@xxxxxxxxxx> wrote:

> Dear Ceph users,
>
> after an outage and recovery of one machine I have several PGs stuck in
> active+recovering+undersized+degraded+remapped. Furthermore, many PGs
> have not been (deep-)scrubbed in time. See below for status and health
> details.
> It's been like this for two days, with no recovery I/O being reported,
> so I guess something is stuck in a bad state. I'd need some help in
> understanding what's going on here and how to fix it.
> Thanks,
>
> Nicola
>
> ---------------------
>
> # ceph -s
>    cluster:
>      id:     b1029256-7bb3-11ec-a8ce-ac1f6b627b45
>      health: HEALTH_WARN
>              2 OSD(s) have spurious read errors
>              Degraded data redundancy: 7349/147534197 objects degraded
> (0.005%), 22 pgs degraded, 22 pgs undersized
>              332 pgs not deep-scrubbed in time
>              503 pgs not scrubbed in time
>              (muted: OSD_SLOW_PING_TIME_BACK OSD_SLOW_PING_TIME_FRONT)
>
>    services:
>      mon: 5 daemons, quorum bofur,balin,aka,romolo,dwalin (age 2d)
>      mgr: bofur.tklnrn(active, since 32h), standbys: balin.hvunfe,
> aka.wzystq
>      mds: 2/2 daemons up, 1 standby
>      osd: 104 osds: 104 up (since 37h), 104 in (since 37h); 22 remapped pgs
>
>    data:
>      volumes: 1/1 healthy
>      pools:   3 pools, 529 pgs
>      objects: 18.53M objects, 40 TiB
>      usage:   54 TiB used, 142 TiB / 196 TiB avail
>      pgs:     7349/147534197 objects degraded (0.005%)
>               2715/147534197 objects misplaced (0.002%)
>               507 active+clean
>               20  active+recovering+undersized+degraded+remapped
>               2   active+recovery_wait+undersized+degraded+remapped
>
> # ceph health detail
> [WRN] PG_DEGRADED: Degraded data redundancy: 7349/147534197 objects
> degraded (0.005%), 22 pgs degraded, 22 pgs undersized
>      pg 3.2c is stuck undersized for 37h, current state
> active+recovery_wait+undersized+degraded+remapped, last acting
> [79,83,34,37,65,NONE,18,95]
>      pg 3.57 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,99,37,NONE,15,104,55,40]
>      pg 3.76 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,5,37,15,100,33,85,NONE]
>      pg 3.9c is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,86,88,NONE,11,69,20,10]
>      pg 3.106 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,15,89,NONE,36,32,23,64]
>      pg 3.107 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,NONE,64,20,61,92,104,43]
>      pg 3.10c is stuck undersized for 37h, current state
> active+recovery_wait+undersized+degraded+remapped, last acting
> [79,34,NONE,95,104,16,69,18]
>      pg 3.11e is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,89,64,46,32,NONE,40,15]
>      pg 3.14e is stuck undersized for 37h, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,34,69,97,85,NONE,46,62]
>      pg 3.160 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,1,101,84,18,33,NONE,69]
>      pg 3.16a is stuck undersized for 37h, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,16,59,103,13,38,49,NONE]
>      pg 3.16e is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,0,27,96,55,10,81,NONE]
>      pg 3.170 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [NONE,57,14,46,55,99,15,40]
>      pg 3.19b is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [NONE,79,59,8,32,17,7,90]
>      pg 3.1a0 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [NONE,79,26,50,104,24,97,40]
>      pg 3.1a5 is stuck undersized for 37h, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,100,61,27,20,NONE,24,85]
>      pg 3.1a8 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,24,NONE,3,55,40,98,45]
>      pg 3.1aa is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,91,48,NONE,24,3,8,85]
>      pg 3.1af is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,NONE,90,33,104,69,26,8]
>      pg 3.1c1 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,95,NONE,53,54,27,18,85]
>      pg 3.1c4 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,69,56,84,95,8,NONE,4]
>      pg 3.1d5 is stuck undersized for 37h, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,48,NONE,104,34,16,37,89]
> [WRN] PG_NOT_DEEP_SCRUBBED: 332 pgs not deep-scrubbed in time
>      pg 3.1ff not deep-scrubbed since 2023-05-18T21:06:57.883787+0000
>      pg 3.1fe not deep-scrubbed since 2023-05-22T19:50:11.497538+0000
>      pg 3.1fd not deep-scrubbed since 2023-05-22T19:44:12.680598+0000
>      pg 3.1fc not deep-scrubbed since 2023-05-20T19:56:43.746580+0000
>      pg 3.1fb not deep-scrubbed since 2023-05-22T18:29:12.794152+0000
>      pg 3.1f9 not deep-scrubbed since 2023-05-19T08:19:16.636964+0000
>      pg 3.1f8 not deep-scrubbed since 2023-05-22T21:49:28.891350+0000
>      pg 3.1f5 not deep-scrubbed since 2023-05-18T21:18:19.636068+0000
>      pg 3.1f4 not deep-scrubbed since 2023-05-18T18:00:41.241562+0000
>      pg 3.1f3 not deep-scrubbed since 2023-05-21T01:36:32.735139+0000
>      pg 3.1f2 not deep-scrubbed since 2023-05-23T03:59:02.154966+0000
>      pg 3.1f1 not deep-scrubbed since 2023-05-22T21:47:46.419880+0000
>      pg 3.1f0 not deep-scrubbed since 2023-05-22T19:17:38.327356+0000
>      pg 3.1ef not deep-scrubbed since 2023-05-19T01:49:04.133392+0000
>      pg 3.1ee not deep-scrubbed since 2023-05-21T12:25:52.010406+0000
>      pg 3.1ed not deep-scrubbed since 2023-05-19T20:13:20.675257+0000
>      pg 3.1eb not deep-scrubbed since 2023-05-18T12:13:53.684650+0000
>      pg 3.1ea not deep-scrubbed since 2023-05-18T09:45:57.172578+0000
>      pg 3.1e9 not deep-scrubbed since 2023-05-23T00:26:18.621324+0000
>      pg 3.1e8 not deep-scrubbed since 2023-05-21T05:15:03.969687+0000
>      pg 3.1e4 not deep-scrubbed since 2023-05-21T16:21:11.738145+0000
>      pg 3.1e3 not deep-scrubbed since 2023-05-22T13:13:19.611165+0000
>      pg 3.1e0 not deep-scrubbed since 2023-05-21T17:43:36.545240+0000
>      pg 3.1de not deep-scrubbed since 2023-05-18T00:03:49.873073+0000
>      pg 3.1dd not deep-scrubbed since 2023-05-22T20:30:56.025015+0000
>      pg 3.1db not deep-scrubbed since 2023-05-22T18:12:44.615539+0000
>      pg 3.1da not deep-scrubbed since 2023-05-20T21:11:00.060022+0000
>      pg 3.1d9 not deep-scrubbed since 2023-05-22T19:02:03.292022+0000
>      pg 3.1d8 not deep-scrubbed since 2023-05-23T17:37:05.320161+0000
>      pg 3.1d6 not deep-scrubbed since 2023-05-19T15:19:58.293551+0000
>      pg 3.1d4 not deep-scrubbed since 2023-05-23T02:28:54.392188+0000
>      pg 3.1d3 not deep-scrubbed since 2023-05-18T06:02:14.181321+0000
>      pg 3.1d2 not deep-scrubbed since 2023-05-18T11:46:29.582700+0000
>      pg 3.1d1 not deep-scrubbed since 2023-05-19T08:31:54.033426+0000
>      pg 3.1cd not deep-scrubbed since 2023-05-21T08:52:41.817826+0000
>      pg 3.1cc not deep-scrubbed since 2023-05-22T22:51:02.466708+0000
>      pg 3.1c9 not deep-scrubbed since 2023-05-18T08:06:50.220587+0000
>      pg 3.1c7 not deep-scrubbed since 2023-05-22T17:07:35.346608+0000
>      pg 3.1c5 not deep-scrubbed since 2023-05-20T17:09:12.048012+0000
>      pg 3.1c1 not deep-scrubbed since 2023-05-21T11:39:47.640196+0000
>      pg 3.1c0 not deep-scrubbed since 2023-05-22T20:22:57.166475+0000
>      pg 3.1bf not deep-scrubbed since 2023-05-19T19:08:08.313143+0000
>      pg 3.1be not deep-scrubbed since 2023-05-21T12:28:17.345386+0000
>      pg 3.1bd not deep-scrubbed since 2023-05-18T19:19:29.002801+0000
>      pg 3.1bb not deep-scrubbed since 2023-05-19T07:15:53.508751+0000
>      pg 3.1b8 not deep-scrubbed since 2023-05-19T18:50:27.701909+0000
>      pg 3.1b6 not deep-scrubbed since 2023-05-19T03:30:55.707248+0000
>      pg 3.1b5 not deep-scrubbed since 2023-05-20T20:37:48.346272+0000
>      pg 3.1b4 not deep-scrubbed since 2023-05-23T02:11:04.833784+0000
>      pg 3.1b3 not deep-scrubbed since 2023-05-18T20:46:40.876590+0000
>      282 more pgs...
> [WRN] PG_NOT_SCRUBBED: 503 pgs not scrubbed in time
>      pg 3.1ff not scrubbed since 2023-05-24T23:37:22.323516+0000
>      pg 3.1fe not scrubbed since 2023-05-25T02:01:18.754476+0000
>      pg 3.1fd not scrubbed since 2023-05-24T20:31:23.239794+0000
>      pg 3.1fc not scrubbed since 2023-05-25T00:42:05.670791+0000
>      pg 3.1fb not scrubbed since 2023-05-24T19:29:29.438626+0000
>      pg 3.1fa not scrubbed since 2023-05-24T21:50:04.911965+0000
>      pg 3.1f9 not scrubbed since 2023-05-25T20:44:49.010622+0000
>      pg 3.1f8 not scrubbed since 2023-05-24T18:17:49.471926+0000
>      pg 3.1f7 not scrubbed since 2023-05-24T17:27:43.545337+0000
>      pg 3.1f6 not scrubbed since 2023-05-24T22:16:04.008644+0000
>      pg 3.1f5 not scrubbed since 2023-05-24T20:14:01.159271+0000
>      pg 3.1f4 not scrubbed since 2023-05-24T16:20:29.746958+0000
>      pg 3.1f3 not scrubbed since 2023-05-25T00:45:49.464448+0000
>      pg 3.1f2 not scrubbed since 2023-05-24T17:37:58.701570+0000
>      pg 3.1f1 not scrubbed since 2023-05-24T20:21:46.824657+0000
>      pg 3.1f0 not scrubbed since 2023-05-25T00:59:02.693836+0000
>      pg 3.1ef not scrubbed since 2023-05-24T21:35:10.061965+0000
>      pg 3.1ee not scrubbed since 2023-05-24T17:13:37.835095+0000
>      pg 3.1ed not scrubbed since 2023-05-24T18:17:21.739348+0000
>      pg 3.1ec not scrubbed since 2023-05-24T17:54:23.365899+0000
>      pg 3.1eb not scrubbed since 2023-05-24T23:18:31.345229+0000
>      pg 3.1ea not scrubbed since 2023-05-25T00:25:06.747723+0000
>      pg 3.1e9 not scrubbed since 2023-05-25T19:27:39.496774+0000
>      pg 3.1e8 not scrubbed since 2023-05-25T01:31:11.083814+0000
>      pg 3.1e7 not scrubbed since 2023-05-25T01:43:43.116599+0000
>      pg 3.1e6 not scrubbed since 2023-05-24T18:26:39.778008+0000
>      pg 3.1e4 not scrubbed since 2023-05-24T22:18:59.986309+0000
>      pg 3.1e3 not scrubbed since 2023-05-24T14:34:52.095564+0000
>      pg 3.1e2 not scrubbed since 2023-05-24T23:56:04.083842+0000
>      pg 3.1e1 not scrubbed since 2023-05-25T02:00:18.766811+0000
>      pg 3.1e0 not scrubbed since 2023-05-25T02:01:42.094304+0000
>      pg 3.1df not scrubbed since 2023-05-24T19:41:59.890557+0000
>      pg 3.1de not scrubbed since 2023-05-24T23:57:49.463552+0000
>      pg 3.1dd not scrubbed since 2023-05-25T17:42:33.397660+0000
>      pg 3.1dc not scrubbed since 2023-05-24T17:34:43.656366+0000
>      pg 3.1db not scrubbed since 2023-05-24T21:48:10.126232+0000
>      pg 3.1da not scrubbed since 2023-05-24T17:54:43.136739+0000
>      pg 3.1d9 not scrubbed since 2023-05-24T20:22:14.256914+0000
>      pg 3.1d8 not scrubbed since 2023-05-24T23:34:56.555311+0000
>      pg 3.1d7 not scrubbed since 2023-05-25T18:08:08.689329+0000
>      pg 3.1d6 not scrubbed since 2023-05-24T20:23:30.301130+0000
>      pg 3.1d5 not scrubbed since 2023-05-25T20:30:25.691077+0000
>      pg 3.1d4 not scrubbed since 2023-05-24T21:21:46.923743+0000
>      pg 3.1d3 not scrubbed since 2023-05-24T18:12:50.468466+0000
>      pg 3.1d2 not scrubbed since 2023-05-24T20:33:32.376232+0000
>      pg 3.1d1 not scrubbed since 2023-05-24T20:32:55.981738+0000
>      pg 3.1d0 not scrubbed since 2023-05-24T18:16:51.195524+0000
>      pg 3.1cf not scrubbed since 2023-05-24T22:32:00.879058+0000
>      pg 3.1ce not scrubbed since 2023-05-25T02:46:02.834267+0000
>      pg 3.1cd not scrubbed since 2023-05-24T21:02:08.288116+0000
>      453 more pgs...
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>