Just a shot: perhaps you have a lot of backfill tasks running at once... You can throttle the recovery by limiting osd_max_backfills; a minimal sketch of the relevant commands follows below the quoted message.

HTH
Mehmet

On 2 December 2022 21:09:04 CET, Wyll Ingersoll <wyllys.ingersoll@xxxxxxxxxxxxxx> wrote:
>
> We have a large cluster (10 PB) which is about 30% full at this point. We recently fixed a configuration issue that then triggered the PG autoscaler to start moving around massive amounts of data (85% misplaced objects, about 7.5 billion objects). The misplaced percentage is dropping slowly (about 10% per day), but overall data usage is growing by about 300 TB/day even though clients are writing well under 30 TB/day.
>
> The issue was that we have both 3x replicated pools and a very large erasure-coded (8+4) data pool for RGW. The autoscaler doesn't run if it sees what it thinks are overlapping roots ("default" vs. "default~hdd" in the CRUSH tree; even though both refer to the same OSDs, they have different IDs: -1 vs. -2). We cleared that by setting the same root for both CRUSH rules, and then the PG autoscaler kicked in and started doing its thing.
>
> The "ceph osd df" output shows OMAP usage jumping significantly, and our available capacity is shrinking MUCH faster than we would expect based on client usage.
>
> Questions:
>
>  * What is causing the OMAP data consumption to grow so fast, and can it be trimmed or throttled?
>  * Will the overhead data be cleaned up once the misplaced object count drops to a much lower value?
>  * Would it do any good to disable the autoscaler at this point, since the PGs have already started being moved?
>  * Any other recommendations to make this go more smoothly?
>
> thanks!
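A minimal sketch of the throttling suggested above, assuming a Nautilus-or-later cluster with the centralized config store; the values are conservative starting points, not tuned recommendations:

    # Limit concurrent backfill operations per OSD (lower = slower, gentler recovery)
    $ ceph config set osd osd_max_backfills 1

    # Limit concurrent recovery operations per OSD
    $ ceph config set osd osd_recovery_max_active 1

    # On older releases without the config store, inject at runtime instead:
    $ ceph tell 'osd.*' injectargs '--osd_max_backfills 1'

Note that on Quincy with the mClock scheduler these settings may be ignored unless osd_mclock_override_recovery_settings is enabled, so check which scheduler your OSDs are running.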
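For anyone diagnosing the same overlapping-roots condition, these standard commands show it; the -1/-2 IDs below are the ones Wyll mentions:

    # Show the shadow roots created by device classes ("default" -1 vs. "default~hdd" -2)
    $ ceph osd crush tree --show-shadow

    # See which root each CRUSH rule takes in its "take" step
    $ ceph osd crush rule dump | grep -A2 '"op": "take"'

    # While roots overlap, the autoscaler skips the affected pools,
    # so they may simply be missing from this output
    $ ceph osd pool autoscale-status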
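On the question of disabling the autoscaler now: it can be turned off per pool or made the default, but to my understanding that only prevents further pg_num changes; backfill for the splits it has already applied will keep running until the misplaced objects are placed:

    # Turn the autoscaler off for one pool (<pool> is a placeholder for your pool name)
    $ ceph osd pool set <pool> pg_autoscale_mode off

    # Make "off" the default for newly created pools
    $ ceph config set global osd_pool_default_pg_autoscale_mode off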