Not sure why, without looking at your crush map in detail. But to be
honest, I don't think you need such a tool anymore. It was written back
in the filestore days, when backfilling could be much more disruptive
than it is today. You have only ~10 OSDs to fill up: just mark them
fully in, or increase the weight in a few steps manually.

.. dan

On Wed, Mar 24, 2021, 6:24 PM Boris Behrens <bb@xxxxxxxxx> wrote:

> I might be stupid, but am I doing something wrong with the script?
>
> [root@mon1 ceph-scripts]# ./tools/ceph-gentle-reweight -o
> 43,44,45,46,47,48,49,50,51,52,53,54,55 -s 00:00 -e 23:59 -b 82 -p rbd -t
> 1.74660
> Draining OSDs: ['43', '44', '45', '46', '47', '48', '49', '50', '51',
> '52', '53', '54', '55']
> Max latency (ms): 20
> Max PGs backfilling: 82
> Delta weight: 0.01
> Target weight: 1.7466
> Latency test pool: rbd
> Run interval: 60
> Start time: 00:00:00
> End time: 23:59:00
> Allowed days: []
> update_osd_tree: loading ceph osd tree
> update_osd_tree: done
> reweight_osds: changing all osds by weight 0.01 (target 1.7466)
> check current time: 18:18:59
> check current day: 2
> get_num_backfilling: PGs currently backfilling: 75
> measure_latency: measuring 4kB write latency
> measure_latency: current latency is 5.50958
> Traceback (most recent call last):
>   File "./tools/ceph-gentle-reweight", line 191, in <module>
>     main(sys.argv[1:])
>   File "./tools/ceph-gentle-reweight", line 186, in main
>     reweight_osds(drain_osds, max_pgs_backfilling, max_latency,
>       delta_weight, target_weight, test_pool, start_time, end_time,
>       allowed_days, interval, really)
>   File "./tools/ceph-gentle-reweight", line 98, in reweight_osds
>     weight = get_crush_weight(osd)
>   File "./tools/ceph-gentle-reweight", line 25, in get_crush_weight
>     raise Exception('Undefined crush_weight for %s' % osd)
> Exception: Undefined crush_weight for 43
>
> I already tried with only a single OSD, and with the -t option left out.
>
> On Wed, Mar 24, 2021 at 4:31 PM, Janne Johansson <icepic.dz@xxxxxxxxx>
> wrote:
>
>> On Wed, Mar 24, 2021 at 2:55 PM, Boris Behrens <bb@xxxxxxxxx> wrote:
>> >
>> > Oh cool. Thanks :)
>> >
>> > How do I find the correct weight after it is added?
>> > For the current process I just check the other OSDs, but this might
>> > be a question that someone will raise.
>> >
>> > I could imagine that I need to adjust ceph-gentle-reweight's target
>> > weight to the correct one.
>>
>> I look at "ceph osd df tree" for the size:
>>
>> [...]
>> 287 hdd 11.00000 1.00000  11 TiB  81 GiB 80 GiB 1.3 MiB 1.7 GiB  11 TiB 0.73 1.03 117 osd.287
>> 295 ssd  3.64000 1.00000 3.6 TiB 9.9 GiB 87 MiB 2.0 GiB 7.9 GiB 3.6 TiB 0.27 0.38  71 osd.295
>>
>> The crush weight 11.00000 should roughly match the 11 TiB detected
>> size of the hdd, just as the crush weight 3.64000 matches the 3.6 TiB
>> size of the ssd.
>>
>> So when you add drives with a lowered weight, you need to check what
>> size the added drive(s) actually have. From there, we have small
>> scripts that take a lot of newly added drives and raise their crush
>> weights at the same time (setting norebalance before changing them,
>> and unsetting it after all drives have gotten a slightly bigger crush
>> weight). That allows for parallelism while not going too wild on the
>> number of changes per round, so the cluster can be HEALTH_OK for a
>> moment in between each step.
>>
>> --
>> May the most significant bit of your life be positive.
>
> --
> This time, as an exception, the "UTF-8 problems" self-help group meets
> in the big hall.
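
For reference, an untested sketch of the manual stepping Dan suggests,
reusing the OSD ids (43-55) and the 1.74660 target weight from the
ceph-gentle-reweight invocation quoted above; the intermediate step
sizes are made up, pick your own:

    # one-shot: set the full target crush weight immediately
    for id in $(seq 43 55); do ceph osd crush reweight osd.$id 1.74660; done

    # ... or step up in a few rounds, letting backfill settle in between
    for id in $(seq 43 55); do ceph osd crush reweight osd.$id 0.5; done
    # later rounds: 1.0, then the final 1.74660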
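And a rough sketch of the kind of batch helper Janne describes, again
with hypothetical ids and a caller-chosen per-round weight; norebalance
holds off data movement until the whole batch has been reweighted, so
the new mapping is computed once per round:

    #!/bin/sh
    # raise a batch of new OSDs by one step, then let the cluster settle
    STEP_WEIGHT=$1              # e.g. 0.5 this round, 1.0 the next
    ceph osd set norebalance
    for id in $(seq 43 55); do
        ceph osd crush reweight osd.$id $STEP_WEIGHT
    done
    ceph osd unset norebalance
    # wait for backfill to finish and HEALTH_OK before the next round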