On Fri 14 Oct 2022 at 12:10, Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx> wrote:
> I've added 5 more nodes to my cluster and got this issue.
> HEALTH_WARN 2 backfillfull osd(s); 17 pool(s) backfillfull; Low space hindering backfill (add storage if this doesn't resolve itself): 4 pgs backfill_toofull
> OSD_BACKFILLFULL 2 backfillfull osd(s)
>     osd.150 is backfill full
>     osd.178 is backfill full
>
> I read on the mailing list that I might need to increase the pg count on some pools to get smaller PGs.
> I also read that I might need to reweight the mentioned full OSDs to 1.2 until it's OK, then set them back.
> Which would be the best solution?

It is not unusual to see "backfill_toofull", especially if the reason for expanding was that space was getting tight. When you add new drives, a lot of PGs need to move, not only from old OSDs to new ones but in all possible directions.

As an example, if you had 16 PGs and three hosts (A, B and C), the PGs would end up something like this:

A: 1, 4, 7, 10, 13, 16
B: 2, 5, 8, 11, 14
C: 3, 6, 9, 12, 15

(5-6 PGs per host)

Then you add hosts D and E, and the layout should become something like:

A: 1, 6, 11, 16
B: 2, 7, 12
C: 3, 8, 13
D: 4, 9, 14
E: 5, 10, 15

(3-4 PGs per host)

From here we can see that A will keep PGs 1 and 16, B will keep PG 2 and C will keep PG 3, but more or less ALL the other PGs will be moving about. D and E will of course receive PGs because they were just added, but A will also send PG 7 to host B, B will send PG 8 to host C, and so on.

If A, B and C are almost full and you add new OSDs (D and E), the cluster will try to schedule *all* the moves. PGs 4, 5, 9, 10, 14 and 15 can start copying at any time, since D and E arrive empty, but the cluster will also ask A to send PG 7 to B, and B to send PG 8 to C. If PG 7 would push host B past the backfill_full limit, or if PG 8 would push host C past it, the cluster pauses those moves with the state backfill_toofull and just leaves them as "misplaced"/"remapped".
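The shuffle in the example above can be sketched with a toy round-robin placement; this is only a stand-in for CRUSH (which is far more involved), and the `place` helper and host names are purely illustrative:

```python
# Toy model of the 16-PG example: round-robin placement of PGs across
# hosts, as a stand-in for CRUSH. Not real Ceph behaviour, just the idea.
def place(pgs, hosts):
    """Assign each PG to a host round-robin."""
    return {pg: hosts[(pg - 1) % len(hosts)] for pg in pgs}

pgs = range(1, 17)
before = place(pgs, ["A", "B", "C"])
after = place(pgs, ["A", "B", "C", "D", "E"])

# PGs whose host changed must backfill; the rest stay put.
moves = {pg: (before[pg], after[pg]) for pg in pgs if before[pg] != after[pg]}
kept = [pg for pg in pgs if before[pg] == after[pg]]

print("kept in place:", kept)  # → [1, 2, 3, 16]
# Moves where the destination is one of the OLD (possibly full) hosts:
print("old-to-old moves:", {pg: m for pg, m in moves.items() if m[1] in "ABC"})
```

Even in this toy model, 12 of the 16 PGs move, and several of them move between the old, nearly full hosts (e.g. PG 7 from A to B and PG 8 from B to C), which is exactly where backfill_toofull can bite.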
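The pause-and-retry behaviour can be sketched with a toy scheduler; the capacities, usage numbers and the 0.90 ratio below are illustrative assumptions, not this cluster's actual settings:

```python
# Toy scheduler illustrating how backfill_toofull resolves itself:
# a move is deferred if it would push the destination past the
# backfillfull ratio, and retried once other moves have freed space.
# All numbers here are made up for illustration.
BACKFILLFULL = 0.90
CAPACITY = 10  # PG-sized units per host

usage = {"A": 8, "B": 9, "C": 9, "D": 0, "E": 0}  # old hosts nearly full
# (src, dst) moves queued by the rebalance; each move is one unit.
queue = [("B", "C"), ("A", "B"), ("A", "D"), ("B", "E"), ("C", "D"), ("C", "E")]

passes = 0
while queue:
    passes += 1
    deferred = []
    for src, dst in queue:
        if (usage[dst] + 1) / CAPACITY > BACKFILLFULL:
            deferred.append((src, dst))  # backfill_toofull: wait for space
        else:
            usage[src] -= 1
            usage[dst] += 1
    if deferred == queue:
        break  # nothing progressed: genuinely stuck, needs more storage
    queue = deferred

print(passes, usage)
```

In the first pass the moves into the nearly full B and C are deferred while the moves onto the empty D and E proceed; in the second pass the deferred moves fit, mirroring the "it fixes itself" outcome described above.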
In the meantime, the other moves get handled, and sooner or later hosts B and C will have moved off enough data that PGs 7 and 8 can move to their correct places, though this may mean they are among the last to move.

Reality is not quite this simple: the straw2 bucket placement algorithm tries to prevent some of this churn, there may be cases where two of the old hosts send PGs to each other, basically just swapping them around, and the fact that every PG is made up of EC k+m or #replica parts makes this explanation a bit simplified. In broad terms, though, this is why you get "errors" when adding new empty drives. It is perfectly OK, and it will fix itself as soon as the other moves have freed enough space for the queued-toofull moves to complete without driving an OSD over the limits.

--
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx