Hi,

Here's a similar bug: https://tracker.ceph.com/issues/47361

Back then, upmap would generate mappings that invalidate the crush rule. I
don't know if that is still the case, but indeed you'll want to correct your
rule.

Something else you can do before applying the new crush map is use osdmaptool
to compare the PG placement before and after, something like:

osdmaptool --test-map-pgs-dump osdmap.before > before.txt
osdmaptool --test-map-pgs-dump osdmap.after > after.txt
diff -u before.txt after.txt

The above will help you estimate how much data will move after injecting the
fixed crush map. So depending on the impact you can schedule the change
appropriately.

I also recommend keeping a backup of the previous crushmap so that you can
quickly restore it if anything goes wrong.

Cheers, Dan

On Mon, Oct 10, 2022, 19:31 Christopher Durham <caduceus42@xxxxxxx> wrote:

> Hello,
> I am using pacific 16.2.10 on Rocky 8.6 Linux.
>
> After setting upmap_max_deviation to 1 on the ceph balancer in ceph-mgr, I
> achieved a near perfect balance of PGs and space on my OSDs. This is great.
>
> However, I started getting the following errors in my ceph-mon logs, every
> three minutes, for each of the OSDs that had been mapped by the balancer:
>
> 2022-10-07T17:10:39.619+0000 7f7c2786d700 1 verify_upmap unable to get parent of osd.497, skipping for now
>
> After banging my head against the wall for a bit trying to figure this
> out, I think I have discovered the issue:
>
> Currently, I have my EC pool configured with the following crush rule:
>
> rule mypoolname {
>     id -5
>     type erasure
>     step take myroot
>     step choose indep 4 type rack
>     step choose indep 2 type pod
>     step chooseleaf indep 1 type host
>     step emit
> }
>
> Basically, pick 4 racks, then 2 pods in each rack, and then one host in
> each pod, for a total of 8 chunks. (The pool is a 6+2.) The 4 racks are
> chosen from the myroot root entry, which is as follows.
>
> root myroot {
>     id -400
>     item rack1 weight N
>     item rack2 weight N
>     item rack3 weight N
>     item rack4 weight N
> }
>
> This has worked fine since inception, over a year ago. And the PGs are all
> as I expect, with OSDs from the 4 racks and not on the same host or pod.
>
> The errors above, verify_upmap, started after I had upmap_max_deviation
> set to 1 in the balancer and had it move things around, creating pg_upmap
> entries.
>
> I then discovered, while trying to figure this out, that the device types
> are:
>
> type 0 osd
> type 1 host
> type 2 chassis
> type 3 rack
> ...
> type 6 pod
>
> So pod is HIGHER in the hierarchy than rack. I have it as lower in my
> rule.
>
> What I want to do is remove the pods completely to work around this.
> Something like:
>
> rule mypoolname {
>     id -5
>     type erasure
>     step take myroot
>     step choose indep 4 type rack
>     step chooseleaf indep 2 type host
>     step emit
> }
>
> This will pick 4 racks and then 2 hosts in each rack. Will this cause any
> problems? I can add the pod stuff back later as 'chassis' instead. I can
> live without the 'pod' separation if needed.
>
> To test this, I tried doing something like this:
>
> 1. grab the osdmap:
>    ceph osd getmap -o /tmp/om
> 2. pull out the crushmap:
>    osdmaptool /tmp/om --export-crush /tmp/crush.bin
> 3. convert it to text:
>    crushtool -d /tmp/crush.bin -o /tmp/crush.txt
>
> I then edited the rule for this pool as above, to remove the pod and go
> directly to pulling from 4 racks then 2 hosts in each rack. I then compiled
> the crush map and imported it into the extracted osdmap:
>
> crushtool -c /tmp/crush.txt -o /tmp/crush.bin
> osdmaptool /tmp/om --import-crush /tmp/crush.bin
>
> I then ran upmap-cleanup on the new osdmap:
>
> osdmaptool /tmp/om --upmap-cleanup
>
> I did NOT get any of the verify_upmap messages (but it did generate some
> rm-pg-upmap-items and some new upmaps in the list of commands to execute).
>
> When I did the extraction of the osdmap WITHOUT any changes to it, and
> then ran the upmap-cleanup, I got the same verify_upmap errors I am now
> seeing in the ceph-mon logs.
>
> So, should I just change the crushmap to remove the wrong rack->pod->host
> hierarchy, making it rack->host?
> Will I have other issues? I am surprised that crush allowed me to create
> this out of order rule to begin with.
>
> Thanks for any suggestions.
>
> -Chris
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
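
P.S. To put a rough number on the diff between before.txt and after.txt
mentioned at the top of this reply, a small shell helper along these lines may
help. This is only a sketch: the function name count_moved_pgs is made up for
illustration, and it assumes each PG line in the --test-map-pgs-dump output
starts with a PG id such as 1.2f followed by the OSD set. Please check that
against your own output before trusting the count.

```shell
# Hypothetical helper (not part of Ceph): count PGs whose mapping line differs
# between two dumps written by `osdmaptool --test-map-pgs-dump`.
# Assumption: PG lines begin with an id like "1.2f"; every other line
# (pool headers, per-OSD summaries) is ignored.
count_moved_pgs() {
    before=$1
    after=$2
    awk '
        # Skip anything whose first field does not look like a PG id.
        $1 !~ /^[0-9]+\.[0-9a-f]+$/ { next }
        # First file: remember the full mapping line for each PG.
        NR == FNR { map[$1] = $0; next }
        # Second file: a PG counts as moved if it is new or its line changed.
        !($1 in map) || map[$1] != $0 { moved++ }
        { total++ }
        END { printf "%d of %d PGs change mapping\n", moved, total }
    ' "$before" "$after"
}

# Typical use, after producing before.txt and after.txt as described above:
#   count_moved_pgs before.txt after.txt
#
# For the backup, the live crush map can be saved and later restored with:
#   ceph osd getcrushmap -o crush.backup.bin
#   ceph osd setcrushmap -i crush.backup.bin
```

The count is PGs, not bytes, so it only gives a first impression of the impact;
a PG that swaps a single OSD is counted the same as one that moves entirely.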