I see autoscale_mode set to "on" for all pools, and I'm guessing your largest pool is bkp365-ncy.rgw.buckets.data, which is sitting at only 32 PGs. I would definitely turn off the autoscaler on it and increase pg_num/pgp_num. Someone with more experience than I can chime in, but I would think something like 2048 would be much better.
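If you go that route, the rough shape of it is below. This is only a sketch: the pool name is taken from your dump further down, and 2048 is just the figure above, so sanity-check it against the usual target of roughly 100 PG replicas per OSD before committing (with EC 3+2, each PG in that pool lands on 5 OSDs).

    # keep the autoscaler from undoing the manual change on this pool
    ceph osd pool set bkp365-ncy.rgw.buckets.data pg_autoscale_mode off

    # raise the PG count; recent releases ramp pg_num/pgp_num up gradually in the background
    ceph osd pool set bkp365-ncy.rgw.buckets.data pg_num 2048
    ceph osd pool set bkp365-ncy.rgw.buckets.data pgp_num 2048

The same would apply to ncy.rgw.buckets.data if that zone carries real data too. Expect a long stretch of backfill while the new PGs are populated.
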
On Thu, Mar 2, 2023 at 6:12 PM Joffrey <joff.au@xxxxxxxxx> wrote:

> root@hbgt-ceph1-mon3:/# ceph osd df
> ID  CLASS  WEIGHT    REWEIGHT  SIZE    RAW USE   DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
>  1  hdd  17.34140  1.00000  17 TiB  6.3 TiB   5.3 TiB  11 KiB   23 GiB   11 TiB   36.17  1.39  17  up
>  3  hdd  17.34140  1.00000  17 TiB  4.9 TiB   4.0 TiB  3.7 GiB  17 GiB   12 TiB   28.47  1.09  11  up
>  5  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  3.2 GiB  12 GiB   14 TiB   20.89  0.80  13  up
>  7  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  3.2 GiB  6.9 GiB  15 TiB   13.32  0.51  19  up
>  9  hdd  17.34140  1.00000  17 TiB  4.9 TiB   4.0 TiB  68 MiB   18 GiB   12 TiB   28.53  1.09  18  up
> 11  hdd  17.34140  1.00000  17 TiB  6.3 TiB   5.3 TiB  403 MiB  23 GiB   11 TiB   36.13  1.38  17  up
> 13  hdd  17.34140  1.00000  17 TiB  1001 GiB  7.1 GiB  9.9 MiB  1.1 GiB  16 TiB    5.64  0.22  18  up
> 15  hdd  17.34140  1.00000  17 TiB  8.9 TiB   7.9 TiB  842 KiB  34 GiB   8.4 TiB  51.41  1.97  18  up
> 17  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  24 KiB   12 GiB   14 TiB   20.90  0.80  17  up
> 19  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  4.1 GiB  6.2 GiB  15 TiB   13.31  0.51  18  up
> 21  hdd  17.34140  1.00000  17 TiB  5.0 TiB   4.0 TiB  206 MiB  17 GiB   12 TiB   28.55  1.09  23  up
> 23  hdd  17.34140  1.00000  17 TiB  4.9 TiB   4.0 TiB  4.2 GiB  17 GiB   12 TiB   28.54  1.09  14  up
>  0  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  7.2 GiB  12 GiB   14 TiB   20.94  0.80  18  up
>  2  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  18 KiB   12 GiB   14 TiB   20.93  0.80  13  up
>  4  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  3.0 GiB  12 GiB   14 TiB   20.95  0.80  20  up
>  6  hdd  17.34140  1.00000  17 TiB  8.9 TiB   7.9 TiB  4.4 MiB  34 GiB   8.4 TiB  51.36  1.97  17  up
>  8  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  965 KiB  6.5 GiB  15 TiB   13.26  0.51  14  up
> 10  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  18 KiB   6.5 GiB  15 TiB   13.25  0.51  13  up
> 12  hdd  17.34140  1.00000  17 TiB  4.9 TiB   4.0 TiB  98 MiB   17 GiB   12 TiB   28.49  1.09  16  up
> 14  hdd  17.34140  1.00000  17 TiB  5.0 TiB   4.0 TiB  4.2 GiB  17 GiB   12 TiB   28.55  1.09  20  up
> 16  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  24 KiB   12 GiB   14 TiB   20.94  0.80  20  up
> 18  hdd  17.34140  1.00000  17 TiB  8.9 TiB   7.9 TiB  17 MiB   34 GiB   8.4 TiB  51.42  1.97  19  up
> 20  hdd  17.34140  1.00000  17 TiB  4.9 TiB   4.0 TiB  3.2 GiB  17 GiB   12 TiB   28.50  1.09  18  up
> 22  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  2.7 GiB  6.2 GiB  15 TiB   13.25  0.51  11  up
> 24  hdd  17.34140  1.00000  17 TiB  4.9 TiB   4.0 TiB  70 MiB   17 GiB   12 TiB   28.50  1.09  18  up
> 25  hdd  17.34140  1.00000  17 TiB  4.9 TiB   4.0 TiB  3.0 GiB  17 GiB   12 TiB   28.51  1.09  16  up
> 26  hdd  17.34140  1.00000  17 TiB  6.3 TiB   5.3 TiB  3.0 GiB  23 GiB   11 TiB   36.13  1.38  15  up
> 27  hdd  17.34140  1.00000  17 TiB  5.0 TiB   4.0 TiB  205 MiB  17 GiB   12 TiB   28.59  1.10  16  up
> 28  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  1.0 MiB  6.3 GiB  15 TiB   13.27  0.51  12  up
> 29  hdd  17.34140  1.00000  17 TiB  4.9 TiB   4.0 TiB  1.3 MiB  17 GiB   12 TiB   28.50  1.09   4  up
> 30  hdd  17.34140  1.00000  17 TiB  6.3 TiB   5.3 TiB  379 KiB  23 GiB   11 TiB   36.14  1.38  16  up
> 31  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  2.5 MiB  12 GiB   14 TiB   20.92  0.80  19  up
> 32  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  11 MiB   12 GiB   14 TiB   20.93  0.80  16  up
> 33  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  18 KiB   12 GiB   14 TiB   20.91  0.80  17  up
> 34  hdd  17.34140  1.00000  17 TiB  6.3 TiB   5.3 TiB  71 MiB   23 GiB   11 TiB   36.15  1.38  19  up
> 35  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  3.3 GiB  6.3 GiB  15 TiB   13.28  0.51  14  up
> 36  hdd  17.34140  1.00000  17 TiB  5.0 TiB   4.0 TiB  0 B      17 GiB   12 TiB   28.59  1.09  13  up
> 37  hdd  17.34140  1.00000  17 TiB  4.9 TiB   4.0 TiB  69 MiB   17 GiB   12 TiB   28.54  1.09  12  up
> 38  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  2.9 GiB  6.7 GiB  15 TiB   13.26  0.51  22  up
> 39  hdd  17.34140  1.00000  17 TiB  6.3 TiB   5.3 TiB  205 MiB  23 GiB   11 TiB   36.19  1.39  25  up
> 40  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  9 KiB    12 GiB   14 TiB   20.88  0.80  14  up
> 41  hdd  17.34140  1.00000  17 TiB  6.3 TiB   5.3 TiB  8.2 GiB  23 GiB   11 TiB   36.11  1.38  20  up
> 42  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  55 KiB   12 GiB   14 TiB   20.91  0.80  16  up
> 43  hdd  17.34140  1.00000  17 TiB  6.3 TiB   5.3 TiB  70 MiB   23 GiB   11 TiB   36.17  1.39  21  up
> 44  hdd  17.34140  1.00000  17 TiB  7.6 TiB   6.6 TiB  18 KiB   28 GiB   9.8 TiB  43.75  1.68  16  up
> 45  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  141 MiB  6.5 GiB  15 TiB   13.29  0.51  17  up
> 46  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  1.7 MiB  6.4 GiB  15 TiB   13.27  0.51  15  up
> 47  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  3.5 GiB  11 GiB   14 TiB   20.89  0.80  22  up
> 48  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  9 KiB    6.3 GiB  15 TiB   13.25  0.51  10  up
> 49  hdd  17.34140  1.00000  17 TiB  8.9 TiB   7.9 TiB  4 KiB    33 GiB   8.4 TiB  51.41  1.97  18  up
> 50  hdd  17.34140  1.00000  17 TiB  7.6 TiB   6.6 TiB  212 MiB  31 GiB   9.7 TiB  43.81  1.68  20  up
> 51  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.6 TiB  85 MiB   13 GiB   14 TiB   20.87  0.80  19  up
> 52  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  5.4 GiB  6.0 GiB  15 TiB   13.34  0.51  18  up
> 53  hdd  17.34140  1.00000  17 TiB  5.0 TiB   4.0 TiB  25 MiB   19 GiB   12 TiB   28.55  1.09  16  up
> 54  hdd  17.34140  1.00000  17 TiB  6.2 TiB   5.3 TiB  198 MiB  23 GiB   11 TiB   35.99  1.38  14  up
> 55  hdd  17.34140  1.00000  17 TiB  5.0 TiB   4.0 TiB  10 GiB   18 GiB   12 TiB   28.59  1.09  26  up
> 56  hdd  17.34140  1.00000  17 TiB  6.3 TiB   5.3 TiB  153 MiB  24 GiB   11 TiB   36.14  1.38  22  up
> 57  hdd  17.34140  1.00000  17 TiB  3.6 TiB   2.7 TiB  58 KiB   12 GiB   14 TiB   20.91  0.80  13  up
> 58  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  3.3 GiB  6.4 GiB  15 TiB   13.23  0.51  11  up
> 59  hdd  17.34140  1.00000  17 TiB  2.3 TiB   1.3 TiB  19 KiB   6.3 GiB  15 TiB   13.27  0.51  11  up
>                      TOTAL  1.0 PiB  272 TiB  213 TiB  84 GiB   942 GiB  769 TiB  26.11
>
>
> root@hbgt-ceph1-mon3:/# ceph osd dump | grep pool
> pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 15503 lfor 0/8533/8531 flags hashpspool stripe_width 0 pg_num_min 1 application mgr,mgr_devicehealth
> pool 2 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 8321 lfor 0/8321/8319 flags hashpspool stripe_width 0 application rgw
> pool 3 'bkp365-ncy.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 8297 lfor 0/8297/8295 flags hashpspool stripe_width 0 application rgw
> pool 4 'bkp365-ncy.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 8054 lfor 0/8054/8052 flags hashpspool stripe_width 0 application rgw
> pool 5 'bkp365-ncy.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 3412 lfor 0/3412/3410 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
> pool 6 'bkp365-ncy.rgw.buckets.data' erasure profile EC32 size 5 min_size 4 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 3500 lfor 0/0/2720 flags hashpspool stripe_width 12288 application rgw
> pool 7 'bkp365-ncy.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 3436 lfor 0/3436/3434 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
> pool 9 'ncy.rgw.buckets.data' erasure profile EC32 size 5 min_size 4 crush_rule 3 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 14975 lfor 0/0/14973 flags hashpspool stripe_width 12288 application rgw
> pool 10 'ncy.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 14979 flags hashpspool stripe_width 0 application rgw
> pool 11 'ncy.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 14981 flags hashpspool stripe_width 0 application rgw
> pool 12 'ncy.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 15105 lfor 0/15105/15103 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
> pool 13 'ncy.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 15236 lfor 0/15236/15234 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
> pool 14 'ncy.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 15241 flags hashpspool stripe_width 0 application rgw
>
> (EC32 is an erasure-coding profile with 3 data chunks and 2 coding chunks)
>
> No output with "ceph osd pool autoscale-status"
>
> On Thu, Mar 2, 2023 at 3:02 PM, Curt <lightspd@xxxxxxxxx> wrote:
>
>> Forgot to do a reply all.
>>
>> What does
>>
>> ceph osd df
>> ceph osd dump | grep pool return?
>>
>> Are you using auto scaling? 289 PGs with 272 TB of data and 60 OSDs, that seems like 3-4 PGs per OSD at almost 1 TB each. Unless I'm thinking of this wrong.
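
To put rough numbers on that now that the df and pool dump are above (my back-of-envelope, so double-check the arithmetic):

    replicated pools: 1+32+32+32+8+8+32+32+8+8+32 = 225 PGs x 3 copies = 675 placements
    EC32 pools:       32+32                       =  64 PGs x 5 shards = 320 placements
    total: ~995 placements / 60 OSDs, so roughly 17 PGs per OSD on average

That lines up with the PGS column in the df output (4-26 per OSD) and is well under the usual ~100 PG replicas per OSD guideline, which is why the data pools sitting at pg_num 32 look like the thing to fix.
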
>>
>> On Thu, Mar 2, 2023, 17:37 Joffrey <joff.au@xxxxxxxxx> wrote:
>>
>>> My Ceph version is 17.2.5 and all the osd_scrub* configuration is at the defaults. I tried some changes to osd-max-backfills but saw no difference.
>>> I have many HDDs with NVMe for the DB, and everything is connected over a 25G network.
>>>
>>> Yes, it has been the same PG for 4 days.
>>>
>>> I had an HDD failure and got many days of recovery+backfilling over the last 2 weeks. Perhaps the 'not in time' warning is related to this.
>>>
>>> 'Jof
>>>
>>> On Thu, Mar 2, 2023 at 2:25 PM, Anthony D'Atri <aad@xxxxxxxxxxxxxx> wrote:
>>>
>>> > Run `ceph health detail`.
>>> >
>>> > Is it the same PG backfilling for a long time, or a different one over time?
>>> >
>>> > That it's remapped makes me think that what you're seeing is the balancer doing its job.
>>> >
>>> > As far as the scrubbing, do you limit the times when scrubbing can happen?
>>> > Are these HDDs? EC?
>>> >
>>> > > On Mar 2, 2023, at 07:20, Joffrey <joff.au@xxxxxxxxx> wrote:
>>> > >
>>> > > Hi,
>>> > >
>>> > > I have many 'not {deep-}scrubbed in time' warnings and 1 PG in remapped+backfilling, and I don't understand why this backfilling is taking so long.
>>> > >
>>> > > root@hbgt-ceph1-mon3:/# ceph -s
>>> > >   cluster:
>>> > >     id:     c300532c-51fa-11ec-9a41-0050569c3b55
>>> > >     health: HEALTH_WARN
>>> > >             15 pgs not deep-scrubbed in time
>>> > >             13 pgs not scrubbed in time
>>> > >
>>> > >   services:
>>> > >     mon: 3 daemons, quorum hbgt-ceph1-mon1,hbgt-ceph1-mon2,hbgt-ceph1-mon3 (age 36h)
>>> > >     mgr: hbgt-ceph1-mon2.nteihj(active, since 2d), standbys: hbgt-ceph1-mon1.thrnnu, hbgt-ceph1-mon3.gmfzqm
>>> > >     osd: 60 osds: 60 up (since 13h), 60 in (since 13h); 1 remapped pgs
>>> > >     rgw: 3 daemons active (3 hosts, 2 zones)
>>> > >
>>> > >   data:
>>> > >     pools:   13 pools, 289 pgs
>>> > >     objects: 67.74M objects, 127 TiB
>>> > >     usage:   272 TiB used, 769 TiB / 1.0 PiB avail
>>> > >     pgs:     288 active+clean
>>> > >              1   active+remapped+backfilling
>>> > >
>>> > >   io:
>>> > >     client:   3.3 KiB/s rd, 1.5 MiB/s wr, 3 op/s rd, 8 op/s wr
>>> > >     recovery: 790 KiB/s, 0 objects/s
>>> > >
>>> > >
>>> > > What can I do to understand this slow recovery (is it the backfill action?)
>>> > >
>>> > > Thank you
>>> > >
>>> > > 'Jof
>>> > > _______________________________________________
>>> > > ceph-users mailing list -- ceph-users@xxxxxxx
>>> > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>> >
>>>
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx