We did have a peering storm; we're past that portion of the backfill and are still experiencing new instances of rbd volumes hanging, so it is definitely not just the peering storm. We still have 22.184% of objects misplaced, with a bunch of pgs left to backfill (around 75k). Our rbd pool is using about 1.7 PiB of storage, so as a rough estimate we're looking at something like 370 TiB yet to backfill. This specific pool is replicated with size=3.

RAW STORAGE:
    CLASS     SIZE       AVAIL      USED       RAW USED     %RAW USED
    hdd       21 PiB     11 PiB     10 PiB       10 PiB         48.73
    TOTAL     21 PiB     11 PiB     10 PiB       10 PiB         48.73

POOLS:
    POOL      ID     PGS       STORED      OBJECTS     USED        %USED     MAX AVAIL
    pool1      4     32768     574 TiB     147.16M     1.7 PiB     68.87       260 TiB

We did see a lot of rbd volumes hang, often logging the buffer i/o errors sent previously - whether that was due to the peering storm or the backfills is uncertain. As suggested, we've already been detaching/reattaching the rbd volumes, pushing the primary acting osd for the affected pgs to another osd, and sometimes rebooting the vm to clear its io queue (roughly the sort of commands sketched at the end of this mail). A combination of those brings the rbd block device back for a while.

We're no longer in a peering storm, yet we're seeing rbd volumes go into an unresponsive state again - including volumes that were unresponsive, that we brought back with the steps above, and that have since gone unresponsive again. All pgs are in an active state: some active+remapped+backfilling, some active+undersized+remapped+backfilling, etc.

We also run the object gateway off the same cluster, with the same backfill under way, and it is not experiencing issues. The osds participating in the backfill are not saturated with i/o, nor showing abnormal load compared to our usual backfill operations.

But as the backfill continues, we keep seeing rbd volumes on active pgs go back into a blocked state. We can recover them the same way - detaching the volume / bouncing the pg to a new primary acting osd - but we'd rather they stop going unresponsive in the first place. Any suggestions in that direction are greatly appreciated.
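
For reference, the detach/re-attach and primary-bounce steps above amount to roughly the commands below. This is only a sketch - the device, image, pg and osd names are placeholders, and dropping primary-affinity is just one way to move the acting primary (restarting the osd, or pg-upmap-primary on newer releases, are alternatives):

# on the client/vm: detach and re-attach the hung image
rbd unmap /dev/rbd0
rbd map pool1/volume-foo

# find the acting primary for a stuck pg, then push the primary role
# elsewhere by temporarily dropping that osd's primary affinity
ceph pg map 4.1f
ceph osd primary-affinity osd.123 0
# once the volume responds again, restore the affinity
ceph osd primary-affinity osd.123 1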