Re: Very slow backfilling

OK, thanks. You mean that the autoscale feature is... stupid?
I'm going to change pg_num/pgp_num and use the legacy formula:
(OSDs * 100) / pool size.
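
For what it's worth, the arithmetic I have in mind (just a sketch; the 60
OSDs and the size-5 EC 3+2 data pool are taken from the output below, and
the result is rounded to the nearest power of two):

    (60 OSDs * 100) / 5 = 1200  ->  pg_num = pgp_num = 1024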


On Thu, Mar 2, 2023 at 5:04 PM, Curt <lightspd@xxxxxxxxx> wrote:

> I see autoscale_mode on all pools, and I'm guessing this is your largest
> pool, bkp365-ncy.rgw.buckets.data, with only 32 PGs. I would definitely turn
> off autoscaling and increase pg_num/pgp_num. Someone with more experience
> than I can chime in, but I would think something like 2048 would be much
> better.
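>
> For example, roughly along these lines (just a sketch, with the pool name
> taken from your dump below; use whatever pg_num target you settle on):
>
>   ceph osd pool set bkp365-ncy.rgw.buckets.data pg_autoscale_mode off
>   ceph osd pool set bkp365-ncy.rgw.buckets.data pg_num 2048
>   ceph osd pool set bkp365-ncy.rgw.buckets.data pgp_num 2048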
>
> On Thu, Mar 2, 2023 at 6:12 PM Joffrey <joff.au@xxxxxxxxx> wrote:
>
>> root@hbgt-ceph1-mon3:/# ceph osd df
>> ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE   DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
>>  1    hdd  17.34140   1.00000   17 TiB   6.3 TiB  5.3 TiB   11 KiB   23 GiB   11 TiB  36.17  1.39   17      up
>>  3    hdd  17.34140   1.00000   17 TiB   4.9 TiB  4.0 TiB  3.7 GiB   17 GiB   12 TiB  28.47  1.09   11      up
>>  5    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB  3.2 GiB   12 GiB   14 TiB  20.89  0.80   13      up
>>  7    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  3.2 GiB  6.9 GiB   15 TiB  13.32  0.51   19      up
>>  9    hdd  17.34140   1.00000   17 TiB   4.9 TiB  4.0 TiB   68 MiB   18 GiB   12 TiB  28.53  1.09   18      up
>> 11    hdd  17.34140   1.00000   17 TiB   6.3 TiB  5.3 TiB  403 MiB   23 GiB   11 TiB  36.13  1.38   17      up
>> 13    hdd  17.34140   1.00000   17 TiB  1001 GiB  7.1 GiB  9.9 MiB  1.1 GiB   16 TiB   5.64  0.22   18      up
>> 15    hdd  17.34140   1.00000   17 TiB   8.9 TiB  7.9 TiB  842 KiB   34 GiB  8.4 TiB  51.41  1.97   18      up
>> 17    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB   24 KiB   12 GiB   14 TiB  20.90  0.80   17      up
>> 19    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  4.1 GiB  6.2 GiB   15 TiB  13.31  0.51   18      up
>> 21    hdd  17.34140   1.00000   17 TiB   5.0 TiB  4.0 TiB  206 MiB   17 GiB   12 TiB  28.55  1.09   23      up
>> 23    hdd  17.34140   1.00000   17 TiB   4.9 TiB  4.0 TiB  4.2 GiB   17 GiB   12 TiB  28.54  1.09   14      up
>>  0    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB  7.2 GiB   12 GiB   14 TiB  20.94  0.80   18      up
>>  2    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB   18 KiB   12 GiB   14 TiB  20.93  0.80   13      up
>>  4    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB  3.0 GiB   12 GiB   14 TiB  20.95  0.80   20      up
>>  6    hdd  17.34140   1.00000   17 TiB   8.9 TiB  7.9 TiB  4.4 MiB   34 GiB  8.4 TiB  51.36  1.97   17      up
>>  8    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  965 KiB  6.5 GiB   15 TiB  13.26  0.51   14      up
>> 10    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB   18 KiB  6.5 GiB   15 TiB  13.25  0.51   13      up
>> 12    hdd  17.34140   1.00000   17 TiB   4.9 TiB  4.0 TiB   98 MiB   17 GiB   12 TiB  28.49  1.09   16      up
>> 14    hdd  17.34140   1.00000   17 TiB   5.0 TiB  4.0 TiB  4.2 GiB   17 GiB   12 TiB  28.55  1.09   20      up
>> 16    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB   24 KiB   12 GiB   14 TiB  20.94  0.80   20      up
>> 18    hdd  17.34140   1.00000   17 TiB   8.9 TiB  7.9 TiB   17 MiB   34 GiB  8.4 TiB  51.42  1.97   19      up
>> 20    hdd  17.34140   1.00000   17 TiB   4.9 TiB  4.0 TiB  3.2 GiB   17 GiB   12 TiB  28.50  1.09   18      up
>> 22    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  2.7 GiB  6.2 GiB   15 TiB  13.25  0.51   11      up
>> 24    hdd  17.34140   1.00000   17 TiB   4.9 TiB  4.0 TiB   70 MiB   17 GiB   12 TiB  28.50  1.09   18      up
>> 25    hdd  17.34140   1.00000   17 TiB   4.9 TiB  4.0 TiB  3.0 GiB   17 GiB   12 TiB  28.51  1.09   16      up
>> 26    hdd  17.34140   1.00000   17 TiB   6.3 TiB  5.3 TiB  3.0 GiB   23 GiB   11 TiB  36.13  1.38   15      up
>> 27    hdd  17.34140   1.00000   17 TiB   5.0 TiB  4.0 TiB  205 MiB   17 GiB   12 TiB  28.59  1.10   16      up
>> 28    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  1.0 MiB  6.3 GiB   15 TiB  13.27  0.51   12      up
>> 29    hdd  17.34140   1.00000   17 TiB   4.9 TiB  4.0 TiB  1.3 MiB   17 GiB   12 TiB  28.50  1.09    4      up
>> 30    hdd  17.34140   1.00000   17 TiB   6.3 TiB  5.3 TiB  379 KiB   23 GiB   11 TiB  36.14  1.38   16      up
>> 31    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB  2.5 MiB   12 GiB   14 TiB  20.92  0.80   19      up
>> 32    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB   11 MiB   12 GiB   14 TiB  20.93  0.80   16      up
>> 33    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB   18 KiB   12 GiB   14 TiB  20.91  0.80   17      up
>> 34    hdd  17.34140   1.00000   17 TiB   6.3 TiB  5.3 TiB   71 MiB   23 GiB   11 TiB  36.15  1.38   19      up
>> 35    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  3.3 GiB  6.3 GiB   15 TiB  13.28  0.51   14      up
>> 36    hdd  17.34140   1.00000   17 TiB   5.0 TiB  4.0 TiB      0 B   17 GiB   12 TiB  28.59  1.09   13      up
>> 37    hdd  17.34140   1.00000   17 TiB   4.9 TiB  4.0 TiB   69 MiB   17 GiB   12 TiB  28.54  1.09   12      up
>> 38    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  2.9 GiB  6.7 GiB   15 TiB  13.26  0.51   22      up
>> 39    hdd  17.34140   1.00000   17 TiB   6.3 TiB  5.3 TiB  205 MiB   23 GiB   11 TiB  36.19  1.39   25      up
>> 40    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB    9 KiB   12 GiB   14 TiB  20.88  0.80   14      up
>> 41    hdd  17.34140   1.00000   17 TiB   6.3 TiB  5.3 TiB  8.2 GiB   23 GiB   11 TiB  36.11  1.38   20      up
>> 42    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB   55 KiB   12 GiB   14 TiB  20.91  0.80   16      up
>> 43    hdd  17.34140   1.00000   17 TiB   6.3 TiB  5.3 TiB   70 MiB   23 GiB   11 TiB  36.17  1.39   21      up
>> 44    hdd  17.34140   1.00000   17 TiB   7.6 TiB  6.6 TiB   18 KiB   28 GiB  9.8 TiB  43.75  1.68   16      up
>> 45    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  141 MiB  6.5 GiB   15 TiB  13.29  0.51   17      up
>> 46    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  1.7 MiB  6.4 GiB   15 TiB  13.27  0.51   15      up
>> 47    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB  3.5 GiB   11 GiB   14 TiB  20.89  0.80   22      up
>> 48    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB    9 KiB  6.3 GiB   15 TiB  13.25  0.51   10      up
>> 49    hdd  17.34140   1.00000   17 TiB   8.9 TiB  7.9 TiB    4 KiB   33 GiB  8.4 TiB  51.41  1.97   18      up
>> 50    hdd  17.34140   1.00000   17 TiB   7.6 TiB  6.6 TiB  212 MiB   31 GiB  9.7 TiB  43.81  1.68   20      up
>> 51    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.6 TiB   85 MiB   13 GiB   14 TiB  20.87  0.80   19      up
>> 52    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  5.4 GiB  6.0 GiB   15 TiB  13.34  0.51   18      up
>> 53    hdd  17.34140   1.00000   17 TiB   5.0 TiB  4.0 TiB   25 MiB   19 GiB   12 TiB  28.55  1.09   16      up
>> 54    hdd  17.34140   1.00000   17 TiB   6.2 TiB  5.3 TiB  198 MiB   23 GiB   11 TiB  35.99  1.38   14      up
>> 55    hdd  17.34140   1.00000   17 TiB   5.0 TiB  4.0 TiB   10 GiB   18 GiB   12 TiB  28.59  1.09   26      up
>> 56    hdd  17.34140   1.00000   17 TiB   6.3 TiB  5.3 TiB  153 MiB   24 GiB   11 TiB  36.14  1.38   22      up
>> 57    hdd  17.34140   1.00000   17 TiB   3.6 TiB  2.7 TiB   58 KiB   12 GiB   14 TiB  20.91  0.80   13      up
>> 58    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB  3.3 GiB  6.4 GiB   15 TiB  13.23  0.51   11      up
>> 59    hdd  17.34140   1.00000   17 TiB   2.3 TiB  1.3 TiB   19 KiB  6.3 GiB   15 TiB  13.27  0.51   11      up
>>                         TOTAL  1.0 PiB   272 TiB  213 TiB   84 GiB  942 GiB  769 TiB  26.11
>>
>>
>> root@hbgt-ceph1-mon3:/# ceph osd dump | grep pool
>> pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 15503 lfor
>> 0/8533/8531 flags hashpspool stripe_width 0 pg_num_min 1 application
>> mgr,mgr_devicehealth
>> pool 2 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 8321 lfor
>> 0/8321/8319 flags hashpspool stripe_width 0 application rgw
>> pool 3 'bkp365-ncy.rgw.log' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
>> 8297 lfor 0/8297/8295 flags hashpspool stripe_width 0 application rgw
>> pool 4 'bkp365-ncy.rgw.control' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
>> 8054 lfor 0/8054/8052 flags hashpspool stripe_width 0 application rgw
>> pool 5 'bkp365-ncy.rgw.meta' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 3412
>> lfor 0/3412/3410 flags hashpspool stripe_width 0 pg_autoscale_bias 4
>> pg_num_min 8 application rgw
>> pool 6 'bkp365-ncy.rgw.buckets.data' erasure profile EC32 size 5 min_size
>> 4 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on
>> last_change 3500 lfor 0/0/2720 flags hashpspool stripe_width 12288
>> application rgw
>> pool 7 'bkp365-ncy.rgw.buckets.index' replicated size 3 min_size 2
>> crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on
>> last_change 3436 lfor 0/3436/3434 flags hashpspool stripe_width 0
>> pg_autoscale_bias 4 pg_num_min 8 application rgw
>> pool 9 'ncy.rgw.buckets.data' erasure profile EC32 size 5 min_size 4
>> crush_rule 3 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on
>> last_change 14975 lfor 0/0/14973 flags hashpspool stripe_width 12288
>> application rgw
>> pool 10 'ncy.rgw.log' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
>> 14979 flags hashpspool stripe_width 0 application rgw
>> pool 11 'ncy.rgw.control' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
>> 14981 flags hashpspool stripe_width 0 application rgw
>> pool 12 'ncy.rgw.meta' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 15105
>> lfor 0/15105/15103 flags hashpspool stripe_width 0 pg_autoscale_bias 4
>> pg_num_min 8 application rgw
>> pool 13 'ncy.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 15236
>> lfor 0/15236/15234 flags hashpspool stripe_width 0 pg_autoscale_bias 4
>> pg_num_min 8 application rgw
>> pool 14 'ncy.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule
>> 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
>> 15241 flags hashpspool stripe_width 0 application rgw
>>
>> (EC32 is an erasure-code profile with 3 data chunks and 2 coding chunks)
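>>
>> For reference, a profile like that would typically have been created along
>> these lines (just a sketch; the failure domain here is an assumption, and
>> "ceph osd erasure-code-profile get EC32" will show the actual settings):
>>
>>   ceph osd erasure-code-profile set EC32 k=3 m=2 crush-failure-domain=host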
>>
>> No output with "ceph osd pool autoscale-status"
>>
>> On Thu, Mar 2, 2023 at 3:02 PM, Curt <lightspd@xxxxxxxxx> wrote:
>>
>>> Forgot to do a reply all.
>>>
>>> What do the following return?
>>>
>>> ceph osd df
>>> ceph osd dump | grep pool
>>>
>>> Are you using autoscaling? 289 PGs with 272 TB of data and 60 OSDs, that
>>> seems like 3-4 PGs per OSD at almost 1 TB each. Unless I'm thinking of
>>> this wrong.
>>>
>>> On Thu, Mar 2, 2023, 17:37 Joffrey <joff.au@xxxxxxxxx> wrote:
>>>
>>>> My Ceph version is 17.2.5 and all osd_scrub* settings are at their
>>>> defaults. I tried some changes to osd_max_backfills but saw no difference.
>>>> I have many HDDs with NVMe for the DB, all connected on a 25G network.
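>>>>
>>>> For reference, the kind of change I tried looks roughly like this (just a
>>>> sketch; the values are examples, and I gather that on 17.x the mClock
>>>> scheduler may cap these unless a recovery-oriented profile is selected):
>>>>
>>>>   ceph config set osd osd_max_backfills 4
>>>>   ceph config set osd osd_recovery_max_active 8
>>>>   ceph config set osd osd_mclock_profile high_recovery_ops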
>>>>
>>>> Yes, it has been the same PG for 4 days.
>>>>
>>>> I had a failure on an HDD and went through many days of
>>>> recovery+backfilling over the last 2 weeks. Perhaps the 'not scrubbed in
>>>> time' warnings are related to this.
>>>>
>>>> 'Jof
>>>>
>>>> On Thu, Mar 2, 2023 at 2:25 PM, Anthony D'Atri <aad@xxxxxxxxxxxxxx> wrote:
>>>>
>>>> > Run `ceph health detail`.
>>>> >
>>>> > Is it the same PG backfilling for a long time, or a different one over
>>>> > time?
>>>> >
>>>> > That it’s remapped makes me think that what you’re seeing is the
>>>> balancer
>>>> > doing its job.
>>>> >
>>>> > As far as the scrubbing, do you limit the times when scrubbing can
>>>> happen?
>>>> > Are these HDDs? EC?
>>>> >
>>>> > > On Mar 2, 2023, at 07:20, Joffrey <joff.au@xxxxxxxxx> wrote:
>>>> > >
>>>> > > Hi,
>>>> > >
>>>> > > I have many 'not {deep-}scrubbed in time' warnings and 1 PG
>>>> remapped+backfilling
>>>> > > and I don't understand why this backfilling is taking so long.
>>>> > >
>>>> > > root@hbgt-ceph1-mon3:/# ceph -s
>>>> > >  cluster:
>>>> > >    id:     c300532c-51fa-11ec-9a41-0050569c3b55
>>>> > >    health: HEALTH_WARN
>>>> > >            15 pgs not deep-scrubbed in time
>>>> > >            13 pgs not scrubbed in time
>>>> > >
>>>> > >  services:
>>>> > >    mon: 3 daemons, quorum
>>>> hbgt-ceph1-mon1,hbgt-ceph1-mon2,hbgt-ceph1-mon3
>>>> > > (age 36h)
>>>> > >    mgr: hbgt-ceph1-mon2.nteihj(active, since 2d), standbys:
>>>> > > hbgt-ceph1-mon1.thrnnu, hbgt-ceph1-mon3.gmfzqm
>>>> > >    osd: 60 osds: 60 up (since 13h), 60 in (since 13h); 1 remapped
>>>> pgs
>>>> > >    rgw: 3 daemons active (3 hosts, 2 zones)
>>>> > >
>>>> > >  data:
>>>> > >    pools:   13 pools, 289 pgs
>>>> > >    objects: 67.74M objects, 127 TiB
>>>> > >    usage:   272 TiB used, 769 TiB / 1.0 PiB avail
>>>> > >    pgs:     288 active+clean
>>>> > >             1   active+remapped+backfilling
>>>> > >
>>>> > >  io:
>>>> > >    client:   3.3 KiB/s rd, 1.5 MiB/s wr, 3 op/s rd, 8 op/s wr
>>>> > >    recovery: 790 KiB/s, 0 objects/s
>>>> > >
>>>> > >
>>>> > > What can I do to understand this slow recovery (is it the backfill
>>>> > > action?)
>>>> > >
>>>> > > Thank you
>>>> > >
>>>> > > 'Jof
>>>> > > _______________________________________________
>>>> > > ceph-users mailing list -- ceph-users@xxxxxxx
>>>> > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>> >
>>>> >
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>>
>>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



