Re: active+remapped+backfilling keeps going .. and going

"Kyriazis, George" <george.kyriazis@xxxxxxxxx> · Fri, 24 Apr 2020 19:05:44 +0000

pg_autoscaler is on, but the number of PGs is stable.  I’ve seen subsequent calls to “ceph -s” list the same total number of PGs, while the number of PGs in remapped+backfilling increased.

I haven’t seen anything in the logs, but perhaps I’m not looking in the right place.  Is there any place in particular I should be looking?

Thanks!

George

# ceph -s
  cluster:
    id:     ec2c9542-dc1b-4af6-9f21-0adbcabb9452
    health: HEALTH_WARN
            603 pgs not deep-scrubbed in time
            603 pgs not scrubbed in time
            2 daemons have recently crashed

  services:
    mon: 5 daemons, quorum vis-ivb-07,vis-ivb-10,vis-hsw-01,vis-clx-01,vis-clx-05 (age 2h)
    mgr: vis-ivb-07(active, since 2h), standbys: vis-hsw-01, vis-ivb-10, vis-clx-01, vis-clx-05
    mds: cephfs:1 {0=vis-hsw-01=up:active} 2 up:standby
    osd: 15 osds: 15 up (since 2d), 15 in (since 8d); 100 remapped pgs

  data:
    pools:   5 pools, 608 pgs
    objects: 46.32M objects, 49 TiB
    usage:   129 TiB used, 75 TiB / 204 TiB avail
    pgs:     8985854/172482064 objects misplaced (5.210%)
             508 active+clean
             100 active+remapped+backfilling

  io:
    client:   102 KiB/s wr, 0 op/s rd, 4 op/s wr
    recovery: 117 MiB/s, 86 objects/s

# ceph -s
  cluster:
    id:     ec2c9542-dc1b-4af6-9f21-0adbcabb9452
    health: HEALTH_WARN
            603 pgs not deep-scrubbed in time
            603 pgs not scrubbed in time
            2 daemons have recently crashed

  services:
    mon: 5 daemons, quorum vis-ivb-07,vis-ivb-10,vis-hsw-01,vis-clx-01,vis-clx-05 (age 5h)
    mgr: vis-ivb-07(active, since 5h), standbys: vis-hsw-01, vis-ivb-10, vis-clx-01, vis-clx-05
    mds: cephfs:1 {0=vis-hsw-01=up:active} 2 up:standby
    osd: 15 osds: 15 up (since 2d), 15 in (since 8d); 103 remapped pgs

  data:
    pools:   5 pools, 608 pgs
    objects: 46.32M objects, 49 TiB
    usage:   128 TiB used, 75 TiB / 204 TiB avail
    pgs:     8681394/172482064 objects misplaced (5.033%)
             505 active+clean
             103 active+remapped+backfilling

  io:
    recovery: 70 MiB/s, 54 objects/s

#
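
One way to read the two “ceph -s” snapshots above is to compare them directly: the misplaced count went down while the number of remapped PGs went up from 100 to 103. The Python sketch below does only that arithmetic, using figures copied from the output. The three-hour interval is an assumption inferred from the mon age changing from 2h to 5h (both values are rounded by ceph), and misplaced_percent is a hypothetical helper name.

```python
# Figures copied from the two `ceph -s` outputs above.
TOTAL_OBJECT_COPIES = 172482064
first = {"misplaced": 8985854, "remapped_pgs": 100}
second = {"misplaced": 8681394, "remapped_pgs": 103}

# Assumption: the snapshots are about 3 hours apart (mon "age 2h" vs "age 5h").
ASSUMED_INTERVAL_HOURS = 3.0


def misplaced_percent(misplaced, total=TOTAL_OBJECT_COPIES):
    """Misplaced object copies as a percentage of all object copies."""
    return 100.0 * misplaced / total


# Net change between the snapshots: backfill progress minus anything newly remapped.
drained = first["misplaced"] - second["misplaced"]
newly_remapped = second["remapped_pgs"] - first["remapped_pgs"]
net_rate_per_hour = drained / ASSUMED_INTERVAL_HOURS

print(f"misplaced: {misplaced_percent(first['misplaced']):.3f}% -> "
      f"{misplaced_percent(second['misplaced']):.3f}%")
print(f"net objects drained: {drained} (about {net_rate_per_hour:.0f}/hour, assumed interval)")
print(f"change in remapped PGs: {newly_remapped:+d}")
```

A falling misplaced count together with a rising remapped PG count is what one would expect if new PGs are being remapped while backfill is still running; the sketch does not say what triggers the remapping.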

On Apr 24, 2020, at 1:52 PM, Eugen Block <eblock@xxxxxx> wrote:

Yes, that means it's off. Can you see anything in the logs? They should show that something triggers the rebalancing. Could it be the pg_autoscaler? Is that enabled?

Quoting "Kyriazis, George" <george.kyriazis@xxxxxxxxx>:

Here is the status of my balancer:

# ceph balancer status
{
   "last_optimize_duration": "",
   "plans": [],
   "mode": "none",
   "active": false,
   "optimize_result": "",
   "last_optimize_started": ""
}
#

Doesn’t that mean it’s “off”?

Thanks,

George
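
For anyone who wants to check this from a script, here is a minimal Python sketch that reads the JSON quoted above and reports whether the balancer looks idle. It assumes the output has been captured as text; balancer_is_idle is a hypothetical helper, and its definition of "idle" (not active, no queued plans) is an assumption made for illustration, not something ceph defines.

```python
import json

# Output of `ceph balancer status`, copied from the message above.
STATUS_JSON = """
{
   "last_optimize_duration": "",
   "plans": [],
   "mode": "none",
   "active": false,
   "optimize_result": "",
   "last_optimize_started": ""
}
"""


def balancer_is_idle(status):
    """Hypothetical helper: True when the balancer is inactive and has no plans."""
    return not status["active"] and not status["plans"]


status = json.loads(STATUS_JSON)
print("mode:", status["mode"])
print("idle:", balancer_is_idle(status))
```

An idle balancer only rules out the balancer module; other sources of remapping, such as the pg_autoscaler asked about elsewhere in this thread, have to be checked separately.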

On Apr 24, 2020, at 1:49 AM, Lomayani S. Laizer <lomlaizer@xxxxxxxxx> wrote:

I had a similar problem when I upgraded to Octopus, and the solution was to turn off autobalancing.

You can try turning it off, if it is enabled:

ceph balancer off

On Fri, Apr 24, 2020 at 8:51 AM Eugen Block <eblock@xxxxxx> wrote:
Hi,
the balancer is probably running; which mode is it in? I changed the
mode to none in our own cluster because it also never finished
rebalancing, and we didn’t have a bad PG distribution. Maybe it’s
supposed to be like that, I don’t know.

Regards
Eugen

Quoting "Kyriazis, George" <george.kyriazis@xxxxxxxxx>:

Hello,

I have a Proxmox ceph cluster with 5 nodes and 3 OSDs each (total 15
OSDs), on a 10G network.

The cluster started small, and I’ve progressively added OSDs over
time.  The problem is that the cluster never rebalances completely.  There
is always progress on backfilling, but PGs that used to be in
active+clean state jump back into the active+remapped+backfilling
(or active+remapped+backfill_wait) state, to be moved to different
OSDs.

Initially I had a 1G network, and I was holding back on the backfill
settings (osd_max_backfills and osd_recovery_sleep_hdd).  In the last
few weeks I upgraded to 10G and now run with osd_max_backfills = 50
and osd_recovery_sleep_hdd = 0 (only HDDs, no SSDs).  The cluster has
been backfilling for months now with no end in sight.

Is this normal behavior?  Is there any setting that I can look at
that will give me an idea as to why PGs are jumping back into
remapped from clean?

Below is output of “ceph osd tree” and “ceph osd df”:

# ceph osd tree
ID  CLASS WEIGHT    TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       203.72472 root default
-9        40.01666     host vis-hsw-01
 3   hdd  10.91309         osd.3           up  1.00000 1.00000
 6   hdd  14.55179         osd.6           up  1.00000 1.00000
10   hdd  14.55179         osd.10          up  1.00000 1.00000
-13        40.01666     host vis-hsw-02
 0   hdd  10.91309         osd.0           up  1.00000 1.00000
 7   hdd  14.55179         osd.7           up  1.00000 1.00000
11   hdd  14.55179         osd.11          up  1.00000 1.00000
-11        40.01666     host vis-hsw-03
 4   hdd  10.91309         osd.4           up  1.00000 1.00000
 8   hdd  14.55179         osd.8           up  1.00000 1.00000
12   hdd  14.55179         osd.12          up  1.00000 1.00000
-3        40.01666     host vis-hsw-04
 5   hdd  10.91309         osd.5           up  1.00000 1.00000
 9   hdd  14.55179         osd.9           up  1.00000 1.00000
13   hdd  14.55179         osd.13          up  1.00000 1.00000
-15        43.65807     host vis-hsw-05
 1   hdd  14.55269         osd.1           up  1.00000 1.00000
 2   hdd  14.55269         osd.2           up  1.00000 1.00000
14   hdd  14.55269         osd.14          up  1.00000 1.00000
-5               0     host vis-ivb-07
-7               0     host vis-ivb-10
#

# ceph osd df
ID CLASS WEIGHT   REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS
 3   hdd 10.91309  1.00000  11 TiB 8.2 TiB 8.2 TiB 552 MiB  25 GiB 2.7 TiB 75.08 1.19 131     up
 6   hdd 14.55179  1.00000  15 TiB 9.1 TiB 9.1 TiB 1.2 GiB  30 GiB 5.5 TiB 62.47 0.99 148     up
10   hdd 14.55179  1.00000  15 TiB 8.1 TiB 8.1 TiB 1.5 GiB  20 GiB 6.4 TiB 55.98 0.89 142     up
 0   hdd 10.91309  1.00000  11 TiB 7.5 TiB 7.4 TiB 504 MiB  24 GiB 3.5 TiB 68.34 1.09 120     up
 7   hdd 14.55179  1.00000  15 TiB 8.7 TiB 8.7 TiB 1.0 GiB  31 GiB 5.8 TiB 60.07 0.95 144     up
11   hdd 14.55179  1.00000  15 TiB 9.4 TiB 9.3 TiB 819 MiB  20 GiB 5.2 TiB 64.31 1.02 147     up
 4   hdd 10.91309  1.00000  11 TiB 7.0 TiB 7.0 TiB 284 MiB  25 GiB 3.9 TiB 64.35 1.02 112     up
 8   hdd 14.55179  1.00000  15 TiB 9.3 TiB 9.2 TiB 1.8 GiB  29 GiB 5.3 TiB 63.65 1.01 157     up
12   hdd 14.55179  1.00000  15 TiB 8.6 TiB 8.6 TiB 623 MiB  19 GiB 5.9 TiB 59.14 0.94 136     up
 5   hdd 10.91309  1.00000  11 TiB 8.6 TiB 8.6 TiB 542 MiB  29 GiB 2.3 TiB 79.01 1.26 134     up
 9   hdd 14.55179  1.00000  15 TiB 8.2 TiB 8.2 TiB 707 MiB  27 GiB 6.3 TiB 56.56 0.90 138     up
13   hdd 14.55179  1.00000  15 TiB 8.7 TiB 8.7 TiB 741 MiB  18 GiB 5.8 TiB 59.85 0.95 134     up
 1   hdd 14.55269  1.00000  15 TiB 9.8 TiB 9.8 TiB 1.3 GiB  20 GiB 4.8 TiB 67.18 1.07 158     up
 2   hdd 14.55269  1.00000  15 TiB 8.7 TiB 8.7 TiB 936 MiB  18 GiB 5.8 TiB 60.04 0.95 148     up
14   hdd 14.55269  1.00000  15 TiB 8.3 TiB 8.3 TiB 673 MiB  18 GiB 6.3 TiB 56.97 0.90 131     up
                     TOTAL 204 TiB 128 TiB 128 TiB  13 GiB 350 GiB  75 TiB 62.95
MIN/MAX VAR: 0.89/1.26  STDDEV: 6.44
#

Thank you!

George
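
The “ceph osd df” output above can also be summarised by comparing each OSD's PG count with the share it would hold if PGs were spread strictly in proportion to CRUSH weight. The Python sketch below does that with the weights and PG counts copied from the table. The proportional model is a simplifying assumption: it ignores the host level of the CRUSH hierarchy and the fact that PGs from different pools hold different amounts of data. pg_deviation is a hypothetical helper name.

```python
# (osd id, CRUSH weight, PG count) copied from the `ceph osd df` output above.
OSDS = [
    (3, 10.91309, 131), (6, 14.55179, 148), (10, 14.55179, 142),
    (0, 10.91309, 120), (7, 14.55179, 144), (11, 14.55179, 147),
    (4, 10.91309, 112), (8, 14.55179, 157), (12, 14.55179, 136),
    (5, 10.91309, 134), (9, 14.55179, 138), (13, 14.55179, 134),
    (1, 14.55269, 158), (2, 14.55269, 148), (14, 14.55269, 131),
]

total_weight = sum(weight for _, weight, _ in OSDS)
total_pgs = sum(pgs for _, _, pgs in OSDS)


def pg_deviation(weight, pgs):
    """PGs held minus the weight-proportional share (positive = holds more)."""
    return pgs - total_pgs * weight / total_weight


# Sort from most over-subscribed to most under-subscribed.
rows = sorted(((pg_deviation(w, p), osd) for osd, w, p in OSDS), reverse=True)
for deviation, osd in rows:
    print(f"osd.{osd:<2} {deviation:+6.1f} PGs relative to weight-proportional share")
```

Under this simplified model the OSD with the largest surplus is osd.5, which is also the one with the highest %USE (79.01) and VAR (1.26) in the table.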

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx