Re: Ceph recovery network speed

On Sat, Jun 25, 2022 at 3:27 AM Anthony D'Atri <anthony.datri@xxxxxxxxx>
wrote:

> The pg_autoscaler aims IMHO way too low and I advise turning it off.
>
>
>
> > On Jun 24, 2022, at 11:11 AM, Curt <lightspd@xxxxxxxxx> wrote:
> >
> >> You wrote 2TB before, are they 2TB or 18TB?  Is that 273 PGs total or
> per
> > osd?
> > Sorry, 18TB of data and 273 PGs total.
> >
> >> `ceph osd df` will show you toward the right how many PGs are on each
> > OSD.  If you have multiple pools, some PGs will have more data than
> others.
> >> So take an average # of PGs per OSD and divide the actual HDD capacity
> > by that.
> > 20 PGs on avg / 2TB (technically 1.8 I guess), which would be 10.
>
> I’m confused.  Is 20 what `ceph osd df` is reporting?  Send me the output
> of

Yes, 20 is roughly the average PG count per OSD:
 ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA      OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
  1    hdd  1.81940   1.00000  1.8 TiB  748 GiB   746 GiB  207 KiB  1.7 GiB  1.1 TiB  40.16  1.68   21      up
  3    hdd  1.81940   1.00000  1.8 TiB  459 GiB   457 GiB    3 KiB  1.2 GiB  1.4 TiB  24.61  1.03   20      up
  5    hdd  1.81940   1.00000  1.8 TiB  153 GiB   152 GiB   32 KiB  472 MiB  1.7 TiB   8.20  0.34   15      up
  7    hdd  1.81940   1.00000  1.8 TiB  471 GiB   470 GiB   83 KiB  1.0 GiB  1.4 TiB  25.27  1.06   24      up
  9    hdd  1.81940   1.00000  1.8 TiB  1.0 TiB  1022 GiB  136 KiB  2.4 GiB  838 GiB  54.99  2.30   19      up
 11    hdd  1.81940   1.00000  1.8 TiB  443 GiB   441 GiB    4 KiB  1.1 GiB  1.4 TiB  23.76  0.99   20      up
 13    hdd  1.81940   1.00000  1.8 TiB  438 GiB   437 GiB  310 KiB  1.0 GiB  1.4 TiB  23.50  0.98   18      up
 15    hdd  1.81940   1.00000  1.8 TiB  334 GiB   333 GiB  621 KiB  929 MiB  1.5 TiB  17.92  0.75   15      up
 17    hdd  1.81940   1.00000  1.8 TiB  310 GiB   309 GiB    2 KiB  807 MiB  1.5 TiB  16.64  0.70   20      up
 19    hdd  1.81940   1.00000  1.8 TiB  433 GiB   432 GiB    7 KiB  974 MiB  1.4 TiB  23.23  0.97   25      up
 45    hdd  1.81940   1.00000  1.8 TiB  169 GiB   169 GiB    2 KiB  615 MiB  1.7 TiB   9.09  0.38   18      up
  0    hdd  1.81940   1.00000  1.8 TiB  582 GiB   580 GiB  295 KiB  1.7 GiB  1.3 TiB  31.24  1.31   21      up
  2    hdd  1.81940   1.00000  1.8 TiB  870 MiB    21 MiB  112 KiB  849 MiB  1.8 TiB   0.05  0.00   14      up
  4    hdd  1.81940   1.00000  1.8 TiB  326 GiB   325 GiB   14 KiB  947 MiB  1.5 TiB  17.48  0.73   24      up
  6    hdd  1.81940   1.00000  1.8 TiB  450 GiB   448 GiB    1 KiB  1.4 GiB  1.4 TiB  24.13  1.01   17      up
  8    hdd  1.81940   1.00000  1.8 TiB  152 GiB   152 GiB  618 KiB  900 MiB  1.7 TiB   8.18  0.34   20      up
 10    hdd  1.81940   1.00000  1.8 TiB  609 GiB   607 GiB    4 KiB  1.7 GiB  1.2 TiB  32.67  1.37   25      up
 12    hdd  1.81940   1.00000  1.8 TiB  333 GiB   332 GiB  175 KiB  1.5 GiB  1.5 TiB  17.89  0.75   24      up
 14    hdd  1.81940   1.00000  1.8 TiB  1.0 TiB   1.0 TiB    1 KiB  2.2 GiB  834 GiB  55.24  2.31   17      up
 16    hdd  1.81940   1.00000  1.8 TiB  168 GiB   167 GiB    4 KiB  1.2 GiB  1.7 TiB   9.03  0.38   15      up
 18    hdd  1.81940   1.00000  1.8 TiB  299 GiB   298 GiB  261 KiB  1.6 GiB  1.5 TiB  16.07  0.67   15      up
 32    hdd  1.81940   1.00000  1.8 TiB  873 GiB   871 GiB   45 KiB  2.3 GiB  990 GiB  46.88  1.96   18      up
 22    hdd  1.81940   1.00000  1.8 TiB  449 GiB   447 GiB  139 KiB  1.6 GiB  1.4 TiB  24.10  1.01   22      up
 23    hdd  1.81940   1.00000  1.8 TiB  299 GiB   298 GiB    5 KiB  1.6 GiB  1.5 TiB  16.06  0.67   20      up
 24    hdd  1.81940   1.00000  1.8 TiB  887 GiB   885 GiB    8 KiB  2.4 GiB  976 GiB  47.62  1.99   23      up
 25    hdd  1.81940   1.00000  1.8 TiB  451 GiB   449 GiB    4 KiB  1.6 GiB  1.4 TiB  24.20  1.01   17      up
 26    hdd  1.81940   1.00000  1.8 TiB  602 GiB   600 GiB  373 KiB  2.0 GiB  1.2 TiB  32.29  1.35   21      up
 27    hdd  1.81940   1.00000  1.8 TiB  152 GiB   151 GiB  1.5 MiB  564 MiB  1.7 TiB   8.14  0.34   14      up
 28    hdd  1.81940   1.00000  1.8 TiB  330 GiB   328 GiB    7 KiB  1.6 GiB  1.5 TiB  17.70  0.74   12      up
 29    hdd  1.81940   1.00000  1.8 TiB  726 GiB   723 GiB    7 KiB  2.1 GiB  1.1 TiB  38.94  1.63   16      up
 30    hdd  1.81940   1.00000  1.8 TiB  596 GiB   594 GiB  173 KiB  2.0 GiB  1.2 TiB  32.01  1.34   19      up
 31    hdd  1.81940   1.00000  1.8 TiB  304 GiB   303 GiB    4 KiB  1.6 GiB  1.5 TiB  16.34  0.68   20      up
 44    hdd  1.81940   1.00000  1.8 TiB  150 GiB   149 GiB      0 B  599 MiB  1.7 TiB   8.03  0.34   12      up
 33    hdd  1.81940   1.00000  1.8 TiB  451 GiB   449 GiB  462 KiB  1.8 GiB  1.4 TiB  24.22  1.01   19      up
 34    hdd  1.81940   1.00000  1.8 TiB  449 GiB   448 GiB    2 KiB  966 MiB  1.4 TiB  24.12  1.01   21      up
 35    hdd  1.81940   1.00000  1.8 TiB  458 GiB   457 GiB    2 KiB  1.5 GiB  1.4 TiB  24.60  1.03   23      up
 36    hdd  1.81940   1.00000  1.8 TiB  872 GiB   870 GiB    3 KiB  2.4 GiB  991 GiB  46.81  1.96   22      up
 37    hdd  1.81940   1.00000  1.8 TiB  443 GiB   441 GiB  136 KiB  1.6 GiB  1.4 TiB  23.77  0.99   16      up
 38    hdd  1.81940   1.00000  1.8 TiB  189 GiB   188 GiB   24 MiB  1.1 GiB  1.6 TiB  10.15  0.42   26      up
 39    hdd  1.81940   1.00000  1.8 TiB  601 GiB   599 GiB  613 KiB  1.9 GiB  1.2 TiB  32.27  1.35   21      up
 40    hdd  1.81940   1.00000  1.8 TiB  314 GiB   312 GiB   13 KiB  1.5 GiB  1.5 TiB  16.84  0.70   17      up
 41    hdd  1.81940   1.00000  1.8 TiB  444 GiB   443 GiB  264 KiB  1.4 GiB  1.4 TiB  23.83  1.00   22      up
 42    hdd  1.81940   1.00000  1.8 TiB  449 GiB   447 GiB   37 KiB  1.7 GiB  1.4 TiB  24.11  1.01   20      up
 43    hdd  1.81940   1.00000  1.8 TiB  175 GiB   175 GiB  111 KiB  640 MiB  1.6 TiB   9.41  0.39   21      up
                        TOTAL   80 TiB   19 TiB    19 TiB   30 MiB   63 GiB   61 TiB  23.91
MIN/MAX VAR: 0.00/2.31  STDDEV: 12.8
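
So at roughly 20 PGs on a 1.8 TiB drive, that works out to something like 90 GiB of raw capacity per PG. And the pool dump: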

pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 3270 flags
hashpspool stripe_width 0 pg_num_min 1 application mgr,mgr_devicehealth
pool 3 '21BadPool' erasure profile 21profile size 3 min_size 2 crush_rule 2
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
1455 lfor 0/0/1303 flags hashpspool,ec_overwrites,selfmanaged_snaps
stripe_width 8192 application rbd,rgw
pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 1310 flags
hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
1313 flags hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
1315 flags hashpspool stripe_width 0 application rgw
pool 7 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 1441
lfor 0/1441/1439 flags hashpspool stripe_width 0 pg_autoscale_bias 4
pg_num_min 8 application rgw
pool 8 'rbd_rep_pool' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 1452 lfor
0/0/1450 flags hashpspool stripe_width 0 application rbd
pool 9 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule
0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change
1583 lfor 0/1583/1581 flags hashpspool stripe_width 0 pg_autoscale_bias 4
pg_num_min 8 application rgw
pool 10 'default.rgw.buckets.non-ec' replicated size 3 min_size 2
crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on
last_change 1617 flags hashpspool stripe_width 0 application rgw
pool 11 'default.rgw.buckets.data' replicated size 3 min_size 2 crush_rule
0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
1620 flags hashpspool stripe_width 0 application rgw
pool 12 'EC-22-Pool' erasure profile EC-22-Pro size 4 min_size 3 crush_rule
3 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
2223 lfor 0/0/2217 flags hashpspool,ec_overwrites,selfmanaged_snaps
stripe_width 8192 application rbd,rgw
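
For what it's worth, with 44 OSDs averaging ~20 PGs each the cluster sits well below the commonly cited target of around 100 PG replicas per OSD, and EC-22-Pool (pool 12, where the backfilling PGs are) is only at pg_num 32. A rough sketch of raising it by hand, assuming the autoscaler is switched off for that pool first so it doesn't scale it straight back down (the 256 target below is purely illustrative, not a recommendation from this thread):

ceph osd pool set EC-22-Pool pg_autoscale_mode off
ceph osd pool set EC-22-Pool pg_num 256
ceph osd pool set EC-22-Pool pgp_num 256

On Nautilus and later releases the mons ramp pg_num/pgp_num up gradually, so the splits happen in the background rather than all at once.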

>
>
> ceph osd df
> ceph osd dump | grep pool
>
> >  Shouldn’t that be used though, not capacity? My usage is only 23%
> capacity.  I
> > thought ceph autoscaling PGs changed the size dynamically according to
> > usage?  I'm guessing I'm misunderstanding that part?
>
> The autoscaler doesn’t strictly react to how full your cluster is.
>
> I suspect that you have too few PGs, which would be a contributing factor.
>
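(For reference, the recovery knobs I'd already been poking at are along these lines; the values here are only examples, not what's currently set:

ceph config set osd osd_recovery_sleep_hdd 0
ceph config set osd osd_max_backfills 4
ceph config set osd osd_recovery_max_active 4

and if the cluster is on Quincy with the default mClock scheduler, some of those sleep/backfill settings may be ignored anyway.)
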
> >
> > Thanks,
> > Curt
> >
> > On Fri, Jun 24, 2022 at 9:48 PM Anthony D'Atri <anthony.datri@xxxxxxxxx>
> > wrote:
> >
> >>
> >>> Yes, SATA, I think my benchmark put it around 125, but that was a year
> >> ago, so could be misremembering
> >>
> >> A FIO benchmark, especially a sequential one on an empty drive, can
> >> mislead as to the real-world performance one sees on a fragmented drive.
> >>
> >>> 273 pg at 18TB so each PG would be 60G.
> >>
> >> You wrote 2TB before, are they 2TB or 18TB?  Is that 273 PGs total or
> per
> >> osd?
> >>
> >>> Mainly used for RBD, using erasure coding.  cephadm bootstrap with
> >> docker images.
> >>
> >> Ack.  Have to account for replication.
> >>
> >> `ceph osd df` will show you toward the right how many PGs are on each
> >> OSD.  If you have multiple pools, some PGs will have more data than
> others.
> >>
> >> So take an average # of PGs per OSD and divide the actual HDD capacity
> by
> >> that.
> >>
> >>
> >>
> >>
> >>>
> >>> On Fri, Jun 24, 2022 at 9:21 PM Anthony D'Atri <
> anthony.datri@xxxxxxxxx>
> >> wrote:
> >>>
> >>>
> >>>>
> >>>> 2 PGs shouldn't take hours to backfill in my opinion.  Just 2TB
> >> enterprise HD's.
> >>>
> >>> SATA? Figure they can write at 70 MB/s
> >>>
> >>> How big are your PGs?  What is your cluster used for?  RBD? RGW?
> CephFS?
> >>>
> >>>>
> >>>> Take this log entry below, 72 minutes and still backfilling
> >> undersized?  Should it be that slow?
> >>>>
> >>>> pg 12.15 is stuck undersized for 72m, current state
> >> active+undersized+degraded+remapped+backfilling, last acting
> [34,10,29,NONE]
> >>>>
> >>>> Thanks,
> >>>> Curt
> >>>>
> >>>>
> >>>> On Fri, Jun 24, 2022 at 8:53 PM Anthony D'Atri <
> >> anthony.datri@xxxxxxxxx> wrote:
> >>>> Your recovery is slow *because* there are only 2 PGs backfilling.
> >>>>
> >>>> What kind of OSD media are you using?
> >>>>
> >>>>> On Jun 24, 2022, at 09:46, Curt <lightspd@xxxxxxxxx> wrote:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> I'm trying to understand why my recovery is so slow with only 2 pg
> >>>>> backfilling.  I'm only getting speeds of 3-4 MiB/s on a 10G
> >> network.  I
> >>>>> have tested the speed between machines with a few tools and all
> >> confirm 10G
> >>>>> speed.  I've tried changing various settings of priority and
> >> recovery sleep
> >>>>> hdd, but still the same. Is this a configuration issue or something
> >> else?
> >>>>>
> >>>>> It's just a small cluster right now with 4 hosts, 11 OSDs per host.
> >> Please let
> >>>>> me know if you need more information.
> >>>>>
> >>>>> Thanks,
> >>>>> Curt
> >>>>> _______________________________________________
> >>>>> ceph-users mailing list -- ceph-users@xxxxxxx
> >>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >>>>
> >>>
> >>
> >>
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



