Maybe I'm being stupid, but am I doing something wrong with the script?

[root@mon1 ceph-scripts]# ./tools/ceph-gentle-reweight -o 43,44,45,46,47,48,49,50,51,52,53,54,55 -s 00:00 -e 23:59 -b 82 -p rbd -t 1.74660
Draining OSDs: ['43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55']
Max latency (ms): 20
Max PGs backfilling: 82
Delta weight: 0.01
Target weight: 1.7466
Latency test pool: rbd
Run interval: 60
Start time: 00:00:00
End time: 23:59:00
Allowed days: []
update_osd_tree: loading ceph osd tree
update_osd_tree: done
reweight_osds: changing all osds by weight 0.01 (target 1.7466)
check current time: 18:18:59
check current day: 2
get_num_backfilling: PGs currently backfilling: 75
measure_latency: measuring 4kB write latency
measure_latency: current latency is 5.50958
Traceback (most recent call last):
  File "./tools/ceph-gentle-reweight", line 191, in <module>
    main(sys.argv[1:])
  File "./tools/ceph-gentle-reweight", line 186, in main
    reweight_osds(drain_osds, max_pgs_backfilling, max_latency, delta_weight, target_weight, test_pool, start_time, end_time, allowed_days, interval, really)
  File "./tools/ceph-gentle-reweight", line 98, in reweight_osds
    weight = get_crush_weight(osd)
  File "./tools/ceph-gentle-reweight", line 25, in get_crush_weight
    raise Exception('Undefined crush_weight for %s' % osd)
Exception: Undefined crush_weight for 43

I already tried it with only a single OSD, and with the -t option left out.

On Wed, Mar 24, 2021 at 16:31, Janne Johansson <icepic.dz@xxxxxxxxx> wrote:

> On Wed, Mar 24, 2021 at 14:55, Boris Behrens <bb@xxxxxxxxx> wrote:
> >
> > Oh cool. Thanks :)
> >
> > How do I find the correct weight after it is added?
> > For the current process I just check the other OSDs, but this might be a
> > question that someone will raise.
> >
> > I could imagine that I need to adjust ceph-gentle-reweight's target
> > weight to the correct one.
>
> I look at "ceph osd df tree" for the size:
>
> [...]
> 287   hdd 11.00000  1.00000   11 TiB   81 GiB   80 GiB  1.3 MiB  1.7 GiB   11 TiB  0.73  1.03  117  osd.287
> 295   ssd  3.64000  1.00000  3.6 TiB  9.9 GiB   87 MiB  2.0 GiB  7.9 GiB  3.6 TiB  0.27  0.38   71  osd.295
>
> The crush weight 11.00000 should roughly match the 11 TiB detected size
> of the hdd, just as the crush weight 3.64000 matches the 3.6 TiB size of
> the ssd.
>
> So when you add drives at a lowered weight, you need to check what size
> the added drive(s) report. From there, we have small scripts that take a
> batch of newly added drives and raise their crush weights at the same
> time: set norebalance before changing them, and unset it after all
> drives have received a slightly higher crush weight. That allows for
> parallelism while not going too wild on the number of changes per round,
> so the cluster can be HEALTH_OK for a moment in between steps. (Rough
> sketches of both steps follow at the end of this mail.)
>
> --
> May the most significant bit of your life be positive.

--
The self-help group "UTF-8 Problems" will, as an exception, meet in the large hall this time.
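To make Janne's first point concrete: the crush weight you want is roughly the drive's size in TiB, as shown in the SIZE column of "ceph osd df tree". A minimal sketch, assuming osd.43 is the newly added drive and reusing the 1.74660 target from the command above (both are just examples, not values confirmed for this cluster):

  # look up the reported size of the new OSD
  ceph osd df tree | grep 'osd\.43$'
  # set its crush weight to (roughly) that size in TiB
  ceph osd crush reweight osd.43 1.74660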
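And a minimal sketch of the batched step Janne describes, assuming bash with jq available; the OSD ids and the 0.1 step size are illustrative assumptions, not taken from his actual scripts:

  ceph osd set norebalance              # pause data movement while weights change
  for id in 43 44 45; do                # example ids only
      # read the current crush weight from the osd tree JSON
      w=$(ceph osd tree -f json | jq ".nodes[] | select(.id == $id) | .crush_weight")
      # nudge it up by a small step (cap it at the drive's real size yourself)
      ceph osd crush reweight osd.$id $(echo "$w + 0.1" | bc)
  done
  ceph osd unset norebalance            # let backfill proceed
  # wait for HEALTH_OK before running the next round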