Re: Weird blocked OP issue.

On Fri, Nov 1, 2019 at 6:10 PM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
>
> We had an OSD host with 13 OSDs fail today and we have a weird blocked
> OP message that I can't understand. There are no OSDs with blocked
> ops, just `mon` (multiple times), and some of the rgw instances.
>
>   cluster:
>    id:     570bcdbb-9fdf-406f-9079-b0181025f8d0
>    health: HEALTH_WARN
>            1 large omap objects
>            Degraded data redundancy: 2083023/195702437 objects
> degraded (1.064%), 880 pgs degraded, 880 pgs undersized
>            1609 pgs not deep-scrubbed in time
>            4 slow ops, oldest one blocked for 506699 sec, daemons
> [mon,sun-gcs02-rgw01,mon,sun-gcs02-rgw02,mon,sun-gcs02-rgw03] have
> slow ops.
>
>  services:
>    mon: 3 daemons, quorum
> sun-gcs02-rgw01,sun-gcs02-rgw02,sun-gcs02-rgw03 (age 6m)
>    mgr: sun-gcs02-rgw02(active, since 5d), standbys: sun-gcs02-rgw03,
> sun-gcs02-rgw04
>    osd: 767 osds: 754 up (since 10m), 754 in (since 104m); 880 remapped pgs
>    rgw: 16 daemons active (sun-gcs02-rgw01.rgw0, sun-gcs02-rgw01.rgw1,
> sun-gcs02-rgw01.rgw2, sun-gcs02-rgw01.rgw3, sun-gcs02-rgw02.rgw0,
> sun-gcs02-rgw02.rgw1, sun-gcs02-rgw02.rgw2, sun-gcs02-rgw02.rgw3,
> sun-gcs02-rgw03.rgw0, sun-gcs02-rgw03.rgw1, sun-gcs02-rgw03.rgw2,
> sun-gcs02-rgw03.rgw3, sun-gcs02-rgw04.rgw0, sun-gcs02-rgw04.rgw1,
> sun-gcs02-rgw04.rgw2, sun-gcs02-rgw04.rgw3)
>
>  data:
>    pools:   7 pools, 8240 pgs
>    objects: 19.57M objects, 52 TiB
>    usage:   88 TiB used, 6.1 PiB / 6.2 PiB avail
>    pgs:     2083023/195702437 objects degraded (1.064%)
>             43492/195702437 objects misplaced (0.022%)
>             7360 active+clean
>             868  active+undersized+degraded+remapped+backfill_wait
>             12   active+undersized+degraded+remapped+backfilling
>
>  io:
>    client:   150 MiB/s rd, 642 op/s rd, 0 op/s wr
>    recovery: 626 MiB/s, 223 objects/s
>
> $ ceph versions
> {
>    "mon": {
>        "ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba)
> nautilus (stable)": 3
>    },
>    "mgr": {
>        "ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba)
> nautilus (stable)": 3
>    },
>    "osd": {
>        "ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be)
> nautilus (stable)": 754
>    },
>    "mds": {},
>    "rgw": {
>        "ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba)
> nautilus (stable)": 16
>    },
>    "overall": {
>        "ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be)
> nautilus (stable)": 754,
>        "ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba)
> nautilus (stable)": 22
>    }
> }
>
> I restarted one of the monitors and it dropped out of the list,
> leaving only 2 blocked ops, but it showed up again a little while later.
>
> Any ideas on where to look?

For posterity's sake, it looks like I got things happy again.

The rgw data pool is 8+2 EC, but it was set to min_size=10. I thought
I had configured min_size=9, but PGs were recovering at the time, so I
didn't think about it. Then one OSD started crashing with something
about strays; it would be restarted and crash again, and then
incomplete PGs showed up. I dropped min_size to 8 to get things
recovered and marked osd.119 out to empty it off. Once the cluster
recovered and all PGs were healthy, I set min_size=9.
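
For reference, checking and changing min_size is just a pool setting.
A rough sketch (the pool name below is a guess at a typical rgw data
pool; substitute whatever `ceph osd pool ls` shows for yours):

$ ceph osd pool get default.rgw.buckets.data min_size    # confirm the current value
$ ceph osd pool set default.rgw.buckets.data min_size 9  # k+1 for an 8+2 EC pool

With 8+2 EC, min_size=9 (k+1) is the usual safe floor; dropping it to
8 (=k) is only a temporary measure to let incomplete PGs activate, and
it should be raised back once recovery finishes.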
I then noticed that what I thought were rgw instances being blocked
were actually the names of the monitors (the hosts are named after the
rgws, but mon, mgr and rgw are all containers on the boxes). I figured
I'd try rolling the first monitor again to see if that unblocked the
op, and sure enough it looks like it unblocked this time and has not
shown up again in 10 minutes.
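
If anyone else hits this, the blocked ops can be inspected on the
monitor itself through its admin socket before bouncing it. Something
like the following should work (the mon name is just an example from
our cluster; since our mons are containers, the ops command has to run
wherever the admin socket lives and the restart depends on how the mon
is deployed):

$ ceph health detail                          # shows which daemons are flagged with slow ops
$ ceph daemon mon.sun-gcs02-rgw01 ops         # dump the ops the mon still considers in flight
$ systemctl restart ceph-mon@sun-gcs02-rgw01  # or restart the mon container, as appropriate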
After letting osd.119 sit empty for about 10 minutes, I set it back in
and it doesn't seem to be crashing anymore, so I wonder if it had some
bad db entry. It's almost halfway back in and so far so good.
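
The osd.119 part was just the standard out/in cycle, roughly:

$ ceph osd out 119    # let backfill drain the PGs off the OSD
$ ceph -s             # wait until the affected PGs are active+clean
$ ceph osd in 119     # then let it backfill back in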

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


