Don't purge anything!

On Fri, Apr 1, 2022 at 9:38 AM Fulvio Galeazzi <fulvio.galeazzi@xxxxxxx> wrote:
>
> Ciao Dan,
>    thanks for your time!
>
> So you are suggesting that my problems with PG 85.25 may somehow resolve
> if I manage to bring up the three OSDs currently "down" (possibly due to
> PG 85.12, and other PGs)?
>
> Looking for the string 'start interval does not contain the required
> bound' I found similar errors in the three OSDs:
>    osd.158: 85.12s0
>    osd.145: 85.33s0
>    osd.121: 85.11s0
>
> Here is the output of "ceph pg 85.12 query":
>    https://pastebin.ubuntu.com/p/ww3JdwDXVd/
> and its status (also showing the other 85.XX PGs, for reference; columns
> are PG OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS*
> LOG STATE SINCE VERSION REPORTED UP ACTING SCRUB_STAMP DEEP_SCRUB_STAMP):
>
> 85.11  39501      0       0  0  165479411712  0  0  3000
>        stale+active+clean                   3d  606021'532631  617659:1827554
>        [124,157,68,72,102]p124  [124,157,68,72,102]p124
>        2022-03-28 07:21:00.566032  2022-03-28 07:21:00.566032
> 85.12  39704  39704  158816  0  166350008320  0  0  3028
>        active+undersized+degraded+remapped  3d  606021'573200  620336:1839924
>        [2147483647,2147483647,2147483647,2147483647,2147483647]p-1
>        [67,91,82,2147483647,112]p67
>        2022-03-15 03:25:28.478280  2022-03-12 19:10:45.866650
> 85.25  39402      0       0  0  165108592640  0  0  3098
>        stale+down+remapped                  3d  606021'521273  618930:1734492
>        [2147483647,2147483647,2147483647,2147483647,2147483647]p-1
>        [2147483647,2147483647,96,2147483647,2147483647]p96
>        2022-03-15 04:08:42.561720  2022-03-09 17:05:34.205121
> 85.33  39319      0       0  0  164740796416  0  0  3000
>        stale+active+clean                   3d  606021'513259  617659:2125167
>        [174,112,85,102,124]p174  [174,112,85,102,124]p174
>        2022-03-28 07:21:12.097873  2022-03-28 07:21:12.097873
>
> So 85.11 and 85.33 do not look bad after all: why are the relevant OSDs
> complaining? Is there a way to force those OSDs to forget about the
> chunks they hold, since apparently the chunks have already migrated
> safely elsewhere?
>
> Indeed, 85.12 is not really healthy...
> As for chunks of 85.12 and 85.25, the three down OSDs have:
>    osd.121: 85.12s3, 85.25s3
>    osd.158: 85.12s0
>    osd.145: none
> I guess I can safely purge osd.145 and re-create it, then.
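>
> (Before re-creating anything, a prudent first step might be to export
> the shards the down OSDs still hold, so they could be re-imported later
> if needed. A minimal ceph-objectstore-tool sketch, run with the OSD
> daemon stopped; the data path assumes the usual
> /var/lib/ceph/osd/$cluster-$id layout, and the export file name is only
> illustrative:
>
>    ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-158 \
>        --op export --pgid 85.12s0 --file /root/pg85.12s0-osd158.export
>
> The same could be done for 85.12s3 and 85.25s3 on osd.121.)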
>
> As for the history of the pool: this is an EC pool, with metadata in an
> SSD-backed replicated pool. At some point I realized I had made a
> mistake in the allocation rule for the "data" part, so I changed the
> relevant rule to:
>
> ~]$ ceph --cluster cephpa1 osd lspools | grep 85
> 85 csd-dataonly-ec-pool
> ~]$ ceph --cluster cephpa1 osd pool get csd-dataonly-ec-pool crush_rule
> crush_rule: csd-data-pool
>
> rule csd-data-pool {
>         id 5
>         type erasure
>         min_size 3
>         max_size 5
>         step set_chooseleaf_tries 5
>         step set_choose_tries 100
>         step take default class big
>         step choose indep 0 type host    <--- this was "osd", before
>         step emit
> }
>
> At the time I changed the rule there was no 'down' PG; all PGs in the
> cluster were 'active', possibly plus some other state (remapped,
> degraded, whatever), as I had added some new disk servers a few days
> before. The rule change, of course, caused some data movement, and after
> a while I found those three OSDs down.
>
>    Thanks!
>
>            Fulvio
>
> On 3/30/22 16:48, Dan van der Ster wrote:
> > Hi Fulvio,
> >
> > I'm not sure why that PG doesn't register.
> > But let's look into your log. The relevant lines are:
> >
> >    -635> 2022-03-30 14:49:57.810 7ff904970700 -1 log_channel(cluster)
> > log [ERR] : 85.12s0 past_intervals [616435,616454) start interval does
> > not contain the required bound [605868,616454) start
> >
> >    -628> 2022-03-30 14:49:57.810 7ff904970700 -1 osd.158 pg_epoch:
> > 616454 pg[85.12s0( empty local-lis/les=0/0 n=0 ec=616435/616435 lis/c
> > 605866/605866 les/c/f 605867/605868/0 616453/616454/616454)
> > [158,168,64,102,156]/[67,91,82,121,112]p67(0) r=-1 lpr=616454
> > pi=[616435,616454)/0 crt=0'0 remapped NOTIFY mbc={}] 85.12s0
> > past_intervals [616435,616454) start interval does not contain the
> > required bound [605868,616454) start
> >
> >    -355> 2022-03-30 14:49:57.816 7ff904970700 -1
> > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.22/rpm/el7/BUILD/ceph-14.2.22/src/osd/PG.cc:
> > In function 'void PG::check_past_interval_bounds() const' thread
> > 7ff904970700 time 2022-03-30 14:49:57.811165
> >
> > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.22/rpm/el7/BUILD/ceph-14.2.22/src/osd/PG.cc:
> > 956: ceph_abort_msg("past_interval start interval mismatch")
> >
> > What is the output of `ceph pg 85.12 query`?
> >
> > What's the history of that PG? Was it moved around recently, prior to
> > this crash?
> > Are the other down OSDs also hosting broken parts of PG 85.12?
> >
> > Cheers, Dan
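
P.S. for the archives: a CRUSH rule edit like the one above is normally
applied by round-tripping the map through crushtool. A minimal sketch,
with illustrative file names:

   ceph osd getcrushmap -o crushmap.bin
   crushtool -d crushmap.bin -o crushmap.txt
   # edit crushmap.txt: "step choose indep 0 type osd" -> "... type host"
   crushtool -c crushmap.txt -o crushmap-new.bin
   ceph osd setcrushmap -i crushmap-new.bin

Note that setcrushmap immediately triggers the kind of data movement
described above, so it is best planned for a time when the cluster is
healthy.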