Hi,

After the upgrade to 13.2.8, deep-scrub has a big impact on client IO: loads of SLOW_OPS and high latency. We hardly ever had SLOW_OPS before, but since the upgrade the impact is so big that we even have OSDs marking each other out (OSD op thread timeout) multiple times during the scrub window. There is plenty of CPU / RAM / IOPS left and hardly any load on these OSD servers. Has anything changed in this release that can explain this behaviour?

Besides this, the impact of rebalancing is very severe as well. With only the balancer remapping a couple of PGs at a time, there are loads of (MDS_)SLOW_OPS. This morning the cephfs metadata pool got rebalanced ... and that triggered a lot of SLOW_OPS. One particular OSD was pegged at 1000% CPU for more than half an hour (while not doing that much IO): that's 10 cores going full throttle! After a restart this issue was gone.

Thanks,

Stefan

-- 
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / info@xxxxxx