Re: [ceph-users] How to test PG mapping with reweight

I finally tracked down David Turner's offline reweight script [0], which in theory does what I'm looking for. However, it uses `crushtool -i <crush_map> --reweight-item osd.<OSD_NUM> <WEIGHT>`, which I confirmed changes the CRUSH weight and not the reweight, and the reweight is what I want to change.

When doing an `osdmaptool -i <osdmap> --dump=json`, I see a weight in the osd objects that corresponds to the reweight and not the CRUSH weight (the CRUSH weight is nowhere to be found in the dump). That seems to match [1]:

    f->dump_float("weight", get_weightf(id));

where get_weightf() is [2]:

    float get_weightf(int o) const {
        return (float)get_weight(o) / (float)CEPH_OSD_IN;
    }

and it looks like a property of the osdmap. That would mean that crushtool would not be able to adjust reweight since it is not a CRUSH attribute, right?
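To make the get_weightf() conversion concrete, here is a minimal Python sketch of the same arithmetic. It assumes CEPH_OSD_IN is 0x10000 (as defined in src/include/rados.h), so the stored value is a fixed-point number where 0x10000 means a reweight of 1.0. The helper names are hypothetical, and rounding to the nearest integer in the reverse direction is an assumption; Ceph itself may truncate.

```python
# Assumed constants, mirroring src/include/rados.h.
CEPH_OSD_IN = 0x10000   # raw weight of an OSD that is fully "in"
CEPH_OSD_OUT = 0        # raw weight of an OSD that is "out"


def raw_to_reweight(raw: int) -> float:
    """Same arithmetic as get_weightf(): raw weight divided by CEPH_OSD_IN."""
    return raw / CEPH_OSD_IN


def reweight_to_raw(reweight: float) -> int:
    """Inverse conversion (hypothetical helper; rounding is an assumption)."""
    if not 0.0 <= reweight <= 1.0:
        raise ValueError("reweight must be between 0.0 and 1.0")
    return int(round(reweight * CEPH_OSD_IN))


if __name__ == "__main__":
    print(raw_to_reweight(0x10000))  # 1.0
    print(raw_to_reweight(0x8000))   # 0.5
    print(hex(reweight_to_raw(0.5)))  # 0x8000
```

This only illustrates how the dumped weight relates to the stored integer; it says nothing about where that integer sits in the encoded osdmap.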

Looking at the code in osdmaptool, I don't see a way to set the reweight for the osds.

Can someone point me to the order of the structs in the encoded osdmap?

I could then read the binary osdmap in Python and make changes to the binary map and then call osdmaptool to check the PG mappings. That sounds better than trying to extend osdmaptool for Jewel.

[0] http://lists.ceph.com/pipermail/ceph-large-ceph.com/attachments/20170110/11d6382f/attachment.py
[1] https://github.com/ceph/ceph/blob/cdb8df13cb7c7242f95ebbf4e404ec0006504a9f/src/osd/OSDMap.cc#L3403
[2] https://github.com/ceph/ceph/blob/f7376c0754ec700a0063bf14d0293cf31ad77798/src/osd/OSDMap.h#L768-L770
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Sat, Sep 7, 2019 at 11:10 AM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
I've done some more digging into this. crushtool doesn't seem to apply reweights as the CRUSH map doesn't have the reweights in it. If I run `ceph osd getmap -o osdmap.o`, then I can run `osdmaptool --print osdmap.o` and I see all the osds with their reweights and if they are in the cluster or down. I can then run `osdmaptool --test-map-pg 5.1450 osdmap.o` and it gives me the OSDs that it would map to. This is exactly what I'm looking for. The only component missing is being able to modify the osdmap offline (since it is binary). There is an `--export-crush` and `--import-crush`, but that only pulls out the CRUSH map (which I can decompile fine), but again doesn't have the reweights or the up/down status.
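Once the PG-to-OSD mappings from two osdmaps are available, comparing them is straightforward. The sketch below assumes the mappings have already been collected into dictionaries keyed by PG id (for example by running `osdmaptool --test-map-pg` against each map and recording the result by hand or with a parser of your own). The function name and the sample data are hypothetical, and parsing the osdmaptool output is deliberately left out.

```python
from typing import Dict, List, Tuple

Mapping = Dict[str, List[int]]


def moved_pgs(before: Mapping, after: Mapping) -> Dict[str, Tuple[List[int], List[int]]]:
    """Return the PGs whose OSD set differs between the two mappings.

    PGs that are missing from `after` are skipped rather than reported.
    """
    moved = {}
    for pgid, old_osds in before.items():
        new_osds = after.get(pgid)
        if new_osds is not None and new_osds != old_osds:
            moved[pgid] = (old_osds, new_osds)
    return moved


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real cluster.
    before = {"5.1450": [12, 40, 7], "5.1451": [3, 9, 21]}
    after = {"5.1450": [12, 40, 7], "5.1451": [3, 15, 21]}
    print(moved_pgs(before, after))
    # {'5.1451': ([3, 9, 21], [3, 15, 21])}
```

Counting the entries in the result gives a rough idea of how much data movement a candidate set of reweights would trigger before anything is injected into the cluster.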

Any ideas on how to modify the binary osdmap offline? It would be great to be able to just insert the final osdmap into the cluster rather than iterating through all 700+ osds and reweighting them one by one.

Thanks,
Robert LeBlanc
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Mon, Sep 2, 2019 at 10:29 AM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
crushtool can simulate mappings for you; its help/man page should
explain everything.

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Mon, Sep 2, 2019 at 7:08 PM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
>
> I'd like to test how reweighting an OSD will change how the PGs map in the cluster.
>
> I suspect that I'd dump the CRUSH map and PGs in the cluster that I'm interested in then use osdmaptool. I'm not understanding how to use osdmaptool to set the reweight, then query a PG or the entire set of PGs that I'm interested in. I then suspect that if I'm okay with the new map that I could inject it into the cluster instead of having to run reweight on the OSD(s).
>
> This is a Jewel cluster and I'm trying to calculate OSD usage offline, then inject a map that is better distributed, instead of doing a reweight, waiting for the PGs to move (which takes a long time), and then repeating that cycle over and over again.
>
> Thanks,
> Robert LeBlanc
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx
