Not sure why, without looking at your crush map in detail. But to be
honest, I don't think you need such a tool anymore. It was written back
in the filestore days, when backfilling could be much more disruptive
than it is today. You have only ~10 OSDs to fill up: just mark them
fully in, or increase the weight in a few steps manually.

.. dan

On Wed, Mar 24, 2021, 6:24 PM Boris Behrens <bb@xxxxxxxxx> wrote:

> I might be stupid, but am I doing something wrong with the script?
>
> [root@mon1 ceph-scripts]# ./tools/ceph-gentle-reweight -o
> 43,44,45,46,47,48,49,50,51,52,53,54,55 -s 00:00 -e 23:59 -b 82 -p rbd -t
> 1.74660
> Draining OSDs: ['43', '44', '45', '46', '47', '48', '49', '50', '51',
> '52', '53', '54', '55']
> Max latency (ms): 20
> Max PGs backfilling: 82
> Delta weight: 0.01
> Target weight: 1.7466
> Latency test pool: rbd
> Run interval: 60
> Start time: 00:00:00
> End time: 23:59:00
> Allowed days: []
> update_osd_tree: loading ceph osd tree
> update_osd_tree: done
> reweight_osds: changing all osds by weight 0.01 (target 1.7466)
> check current time: 18:18:59
> check current day: 2
> get_num_backfilling: PGs currently backfilling: 75
> measure_latency: measuring 4kB write latency
> measure_latency: current latency is 5.50958
> Traceback (most recent call last):
>   File "./tools/ceph-gentle-reweight", line 191, in <module>
>     main(sys.argv[1:])
>   File "./tools/ceph-gentle-reweight", line 186, in main
>     reweight_osds(drain_osds, max_pgs_backfilling, max_latency,
>       delta_weight, target_weight, test_pool, start_time, end_time,
>       allowed_days, interval, really)
>   File "./tools/ceph-gentle-reweight", line 98, in reweight_osds
>     weight = get_crush_weight(osd)
>   File "./tools/ceph-gentle-reweight", line 25, in get_crush_weight
>     raise Exception('Undefined crush_weight for %s' % osd)
> Exception: Undefined crush_weight for 43
>
> I already tried with only a single OSD, and with the -t option left out.
>
> On Wed, Mar 24, 2021 at 4:31 PM, Janne Johansson <icepic.dz@xxxxxxxxx>
> wrote:
>
>> On Wed, Mar 24, 2021 at 2:55 PM, Boris Behrens <bb@xxxxxxxxx> wrote:
>> >
>> > Oh cool. Thanks :)
>> >
>> > How do I find the correct weight after it is added?
>> > For the current process I just check the other OSDs, but this might
>> > be a question that someone will raise.
>> >
>> > I could imagine that I need to adjust ceph-gentle-reweight's target
>> > weight to the correct one.
>>
>> I look at "ceph osd df tree" for the size:
>>
>> [...]
>> 287 hdd 11.00000 1.00000  11 TiB  81 GiB 80 GiB 1.3 MiB 1.7 GiB  11 TiB 0.73 1.03 117 osd.287
>> 295 ssd  3.64000 1.00000 3.6 TiB 9.9 GiB 87 MiB 2.0 GiB 7.9 GiB 3.6 TiB 0.27 0.38  71 osd.295
>>
>> The crush weight 11.00000 should roughly match the 11 TiB detected
>> size of the hdd, just as the crush weight 3.64000 matches the 3.6 TiB
>> size of the ssd.
>>
>> So when you add drives with a lowered weight, you need to check what
>> size the added drive(s) actually have. From there, we have small
>> scripts that take a lot of newly added drives and raise their crush
>> weights at the same time (setting norebalance before changing them,
>> and unsetting it after all drives have gotten a slightly bigger crush
>> weight). That allows for parallelism while not going too wild on the
>> number of changes per round, so the cluster can be HEALTH_OK for a
>> moment in between each step.
>>
>> --
>> May the most significant bit of your life be positive.
>
> --
> This time, as an exception, the "UTF-8 problems" self-help group meets
> in the big hall.
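
For reference, an untested sketch of the manual stepping Dan suggests,
reusing the OSD ids (43-55) and the 1.74660 target weight from the
ceph-gentle-reweight invocation quoted above; the intermediate step
sizes are made up, pick your own:

    # one-shot: set the full target crush weight immediately
    for id in $(seq 43 55); do ceph osd crush reweight osd.$id 1.74660; done

    # ... or step up in a few rounds, letting backfill settle in between
    for id in $(seq 43 55); do ceph osd crush reweight osd.$id 0.5; done
    # later rounds: 1.0, then the final 1.74660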
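And a rough sketch of the kind of batch helper Janne describes, again
with hypothetical ids and a caller-chosen per-round weight; norebalance
holds off data movement until the whole batch has been reweighted, so
the new mapping is computed once per round:

    #!/bin/sh
    # raise a batch of new OSDs by one step, then let the cluster settle
    STEP_WEIGHT=$1              # e.g. 0.5 this round, 1.0 the next
    ceph osd set norebalance
    for id in $(seq 43 55); do
        ceph osd crush reweight osd.$id $STEP_WEIGHT
    done
    ceph osd unset norebalance
    # wait for backfill to finish and HEALTH_OK before the next round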