On 2020-09-01 10:51, Marcel Kuiper wrote:
> As a matter of fact we did. We doubled the storage nodes from 25 to 50.
> Total osds now 460.
>
> You want to share your thoughts on that?

OK, I'm really curious whether you observed the following behaviour:

During, or shortly after, the rebalance, did you see high CPU usage on the OSDs? In particular on the ones that hosted the PGs before they were moved to the new nodes? As in ~300% CPU per OSD (increasing from a few percent to 300%, non-stop)?

RocksDB is doing housekeeping, and we observed before, and again today, on Mimic 13.2.8, that with a lot of OMAP/META data the OSDs that have to clean up consume a ridiculous amount of CPU (for hours on end), triggering loads of slow ops and latency spikes of sometimes tens of seconds.

Are you running Nautilus? If you haven't seen this behaviour, it might have been fixed in Nautilus. Or your cluster is different from ours.

We will do PG expansion after we have upgraded to Nautilus, so we'll definitely know by then.

Thanks,

Stefan