Okay, so far I've figured out that the value in the Ceph dashboard is gathered from a Prometheus metric (*ceph_osd_numpg*). Does anyone here know how this metric is populated?
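In case anyone wants to reproduce the comparison: the raw value can be read straight from the mgr prometheus endpoint and set against what pg dump reports. A rough sketch only, assuming the prometheus module on its default port 9283, a manager host reachable as "mgr-host", osd.12 as a placeholder id, and the column layout of a Nautilus "pg dump pgs_brief" (check the header before trusting the awk field numbers):

    # value Prometheus (and hence the dashboard) sees for one OSD
    curl -s http://mgr-host:9283/metrics | grep 'ceph_osd_numpg{' | grep '"osd.12"'

    # number of PGs that have osd.12 in their "up" set ($3) / "acting" set ($5)
    ceph pg dump pgs_brief 2>/dev/null | awk '$3 ~ /[[,]12[],]/' | wc -l
    ceph pg dump pgs_brief 2>/dev/null | awk '$5 ~ /[[,]12[],]/' | wc -l

    # per-OSD PG count as shown in the PGS column of 'ceph osd df', for comparison
    ceph osd df | awk '$1 == 12'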
On Mon, 26 Oct 2020 at 12:52, Kristof Coucke <kristof.coucke@xxxxxxxxx> wrote:

> Hi Frank,
>
> We have a lot of small objects in the cluster, and RocksDB has issues
> with compaction, causing high disk load. That's why we are performing
> manual compaction.
> See https://github.com/ceph/ceph/pull/37496
>
> Br,
>
> Kristof
>
> On Mon, 26 Oct 2020 at 12:14, Frank Schilder <frans@xxxxxx> wrote:
>
>> Hi Kristof,
>>
>> I missed that: why do you need to do manual compaction?
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Kristof Coucke <kristof.coucke@xxxxxxxxx>
>> Sent: 26 October 2020 11:33:52
>> To: Frank Schilder; a.jazdzewski@xxxxxxxxxxxxxx
>> Cc: ceph-users@xxxxxxx
>> Subject: Re: Question about expansion existing Ceph cluster - adding OSDs
>>
>> Hi Ansgar, Frank, all,
>>
>> First of all, thanks for the feedback.
>>
>> In the meantime, I've added all the disks and the cluster is rebalancing
>> itself, which will take ages as you've mentioned. Last week, after this
>> conversation, it was a little over 50%; today it's around 44.5%.
>> Every day I have to take the cluster down to run manual compaction on
>> some disks :-(, but that's a known bug that Igor is working on. (Kudos to
>> him when I get my sleep back at night thanks to this one...)
>>
>> Though, I'm still having an issue which I don't completely understand.
>> When I look at the OSDs page in the Ceph dashboard, I can see the number
>> of PGs for a specific OSD. Does someone know how this is calculated?
>> Because it seems incorrect...
>> For example, a specific disk shows 189 PGs in the dashboard. However,
>> examining the pg dump output, I can see that for that particular disk
>> there are 145 PGs where the disk is in the "up" list and 168 PGs where
>> the disk is in the "acting" list. Of those two lists, 135 are in common,
>> meaning 10 PGs still need to be moved to that disk, while 33 PGs need to
>> be moved away.
>> I can't figure out how the dashboard arrives at the figure of 189...
>> The same applies to other disks (a delta between the pg dump output and
>> the info in the Ceph dashboard).
>>
>> Another example is one disk which I've set to weight 0 because it is
>> predicted to fail in the near future. Its "up" list is empty (which is
>> correct), and it is in the "acting" set of 49 PGs. That also seems
>> correct, as these 49 PGs need to be moved away. However, the Ceph
>> dashboard UI says there are 71 PGs on that disk...
>>
>> So:
>> - How does the Ceph dashboard get that number in the first place?
>> - Is it possible that there are "orphaned" parts of PGs left behind on a
>> particular OSD?
>> - If orphaned parts of a PG can be left behind on a disk, how do I clean
>> them up?
>>
>> I've also tried examining the osdmap, but the output seems to be
>> limited (??). I only see the PGs for pools 1 and 2. (I don't know whether
>> the file gets truncated when exporting the osdmap or by osdmaptool
>> --print.)
>>
>> The cluster is running Nautilus v14.2.11, all on the same version.
>>
>> I'll make some time to write up and document everything I've run into
>> during the journey of the last two weeks... Kristof in Ceph's
>> wunderland...
>>
>> Thanks for all your input so far!
>>
>> Regards,
>>
>> Kristof
>>
>>
>> On Wed, 21 Oct 2020 at 14:01, Frank Schilder <frans@xxxxxx> wrote:
>> There have been threads on exactly this. It might depend a bit on your
>> Ceph version. We are running Mimic and have no issues doing:
>>
>> - set noout, norebalance, nobackfill
>> - add all OSDs (with weight 1)
>> - wait for peering to complete
>> - unset all flags and let the rebalance loose
>>
>> Starting with Nautilus there seem to be issues with this procedure;
>> mainly, the peering phase can cause a collapse of the cluster. In your
>> case, it sounds like you added the OSDs already. You should be able to do
>> the following relatively safely:
>>
>> - set noout, norebalance, nobackfill
>> - set the weight of the OSDs to 1 one by one and wait for peering to
>> complete every time
>> - unset all flags and let the rebalance loose
>>
>> I believe once the peering has succeeded without crashes, the rebalancing
>> will just work fine. You can easily control how much rebalancing is going
>> on.
>>
>> I noticed that Ceph seems to have a strange concept of priority, though.
>> I needed to gain capacity by adding OSDs, and Ceph consistently moved PGs
>> off the fullest OSDs last, the opposite of what should happen. Thus, it
>> took ages for additional capacity to become available, and the
>> backfill_toofull warnings stayed the whole time. You can influence this
>> to some degree by using force_recovery commands on PGs on the fullest
>> OSDs.
>>
>> Best regards and good luck,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Kristof Coucke <kristof.coucke@xxxxxxxxx>
>> Sent: 21 October 2020 13:29:00
>> To: ceph-users@xxxxxxx
>> Subject: Question about expansion existing Ceph cluster - adding OSDs
>>
>> Hi,
>>
>> I have a cluster with 182 OSDs; it has been expanded to 282 OSDs. Some
>> disks were nearly full.
>> The new disks have been added with an initial weight of 0.
>> The original plan was to increase this slowly towards their full weight
>> using the gentle reweight script. However, this is going way too slow,
>> and I'm also now having issues with "backfill_toofull".
>> Can I just add all the OSDs with their full weight, or will I run into a
>> lot of issues if I do that?
>> I know that a lot of PGs will have to be moved, but increasing the weight
>> slowly will take a year at the current speed. I'm already playing with
>> the max backfill setting to increase the speed, but every time I increase
>> the weight it takes a long time again...
>> I can live with the fact that there will be a performance decrease.
>>
>> Looking forward to your comments!
>>
>> Regards,
>>
>> Kristof
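For completeness: the procedure Frank describes above maps onto roughly the following commands. This is only a sketch; osd.182 and the CRUSH weight 9.1 are placeholders for one of the new OSDs and its disk size, osd_max_backfills is the "max backfill" knob mentioned in the original mail, and whether you reweight the OSDs one by one or all at once is the Mimic-versus-Nautilus trade-off discussed above:

    # stop data movement while the new OSDs peer
    ceph osd set noout
    ceph osd set norebalance
    ceph osd set nobackfill

    # raise one new OSD to its target weight (use 'ceph osd reweight osd.182 1.0'
    # instead if you drive this via the override reweight rather than CRUSH weight),
    # then wait for peering to settle before doing the next one
    ceph osd crush reweight osd.182 9.1
    ceph -s

    # once peering is quiet, let the rebalance loose
    ceph osd unset nobackfill
    ceph osd unset norebalance
    ceph osd unset noout

    # optionally adjust backfill concurrency, and push PGs on the fullest
    # OSDs to the front of the queue
    ceph config set osd osd_max_backfills 2
    ceph pg force-recovery <pgid> [<pgid>...]
    ceph pg force-backfill <pgid> [<pgid>...]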