Hi,

after enabling the ceph balancer (with the command "ceph balancer on") the health status changed to error.
This is the current output of ceph health detail:

root@ld3955:~# ceph health detail
HEALTH_ERR 1438 slow requests are blocked > 32 sec; 861 stuck requests are blocked > 4096 sec; mon ld5505 is low on available space
REQUEST_SLOW 1438 slow requests are blocked > 32 sec
    683 ops are blocked > 2097.15 sec
    436 ops are blocked > 1048.58 sec
    191 ops are blocked > 524.288 sec
    78 ops are blocked > 262.144 sec
    35 ops are blocked > 131.072 sec
    11 ops are blocked > 65.536 sec
    4 ops are blocked > 32.768 sec
    osd.62 has blocked requests > 65.536 sec
    osds 39,72 have blocked requests > 262.144 sec
    osds 6,19,67,173,174,187,188,269,434 have blocked requests > 524.288 sec
    osds 8,16,35,36,37,61,63,64,68,73,75,178,186,271,369,420,429,431,433,436 have blocked requests > 1048.58 sec
    osds 3,5,7,24,34,38,40,41,59,66,69,74,180,270,370,421,432,435 have blocked requests > 2097.15 sec
REQUEST_STUCK 861 stuck requests are blocked > 4096 sec
    25 ops are blocked > 8388.61 sec
    836 ops are blocked > 4194.3 sec
    osds 2,28,29,32,60,65,181,185,268,368,423,424,426 have stuck requests > 4194.3 sec
    osds 0,30,70,71,184 have stuck requests > 8388.61 sec

I understand that when the balancer starts shifting PGs to other OSDs, this causes additional I/O load on the cluster.
However, I don't understand why this is affecting the OSDs so heavily.
And I don't understand why OSDs of a specific type (SSD, NVMe) suffer although there is no balancing occurring on them.

Regards
Thomas
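
P.S. For reference, this is what I plan to try next to pause the balancer and throttle the recovery traffic while I investigate (the values are only my guesses, not tested recommendations):

    # stop the balancer from scheduling further PG moves
    ceph balancer off
    ceph balancer status

    # reduce backfill/recovery concurrency per OSD so client I/O recovers
    ceph config set osd osd_max_backfills 1
    ceph config set osd osd_recovery_max_active 1

Please tell me if throttling like this is the right approach, or if the blocked requests point to a different problem.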