Hi,

I'm facing several issues with my Ceph cluster (2x MDS, 6x OSD nodes). Here I would like to focus on the issue with pgs backfill_toofull. I assume this is related to the fact that the data distribution on my OSDs is not balanced.

This is the current ceph status:

root@ld3955:~# ceph -s
  cluster:
    id:     6b1b5117-6e08-4843-93d6-2da3cf8a6bae
    health: HEALTH_ERR
            1 MDSs report slow metadata IOs
            78 nearfull osd(s)
            1 pool(s) nearfull
            Reduced data availability: 2 pgs inactive, 2 pgs peering
            Degraded data redundancy: 304136/153251211 objects degraded (0.198%), 57 pgs degraded, 57 pgs undersized
            Degraded data redundancy (low space): 265 pgs backfill_toofull
            3 pools have too many placement groups
            74 slow requests are blocked > 32 sec
            80 stuck requests are blocked > 4096 sec

  services:
    mon: 3 daemons, quorum ld5505,ld5506,ld5507 (age 98m)
    mgr: ld5505(active, since 3d), standbys: ld5506, ld5507
    mds: pve_cephfs:1 {0=ld3976=up:active} 1 up:standby
    osd: 368 osds: 368 up, 367 in; 302 remapped pgs

  data:
    pools:   5 pools, 8868 pgs
    objects: 51.08M objects, 195 TiB
    usage:   590 TiB used, 563 TiB / 1.1 PiB avail
    pgs:     0.023% pgs not active
             304136/153251211 objects degraded (0.198%)
             1672190/153251211 objects misplaced (1.091%)
             8564 active+clean
             196  active+remapped+backfill_toofull
             57   active+undersized+degraded+remapped+backfill_toofull
             35   active+remapped+backfill_wait
             12   active+remapped+backfill_wait+backfill_toofull
             2    active+remapped+backfilling
             2    peering

  io:
    recovery: 18 MiB/s, 4 objects/s

Currently I'm using 6 OSD nodes:
  Node A: 48x 1.6TB HDD
  Node B: 48x 1.6TB HDD
  Node C: 48x 1.6TB HDD
  Node D: 48x 1.6TB HDD
  Node E: 48x 7.2TB HDD
  Node F: 48x 7.2TB HDD

Question:
Is it advisable to distribute the drives equally over all nodes?
If yes, how should this be done without disrupting Ceph?

Regards
Thomas
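
P.S. In case it helps, these are the commands I would run to confirm the per-OSD imbalance before touching any hardware (output omitted here for brevity; the last check assumes the mgr balancer module is available on this release):

  # utilisation per OSD, grouped by host, incl. %USE and VAR columns
  ceph osd df tree

  # overall raw and per-pool usage
  ceph df

  # whether the balancer module is enabled and which mode it uses
  ceph balancer status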