Hi everyone,

I'm currently fighting an issue in a cluster we run for a customer. It's
used for a large number of small files (currently ~113 million) that are
pulled via radosgw. We have 3 nodes with 24 OSDs in total. The index and
other non-data pools were migrated to a separate CRUSH root called "ssd",
which contains only SSD drives - one SSD per node. We did this because we
previously had an issue where a single crashed HDD OSD brought the entire
RGW down.

Today one of those SSDs failed. After replacing the drive and starting
recovery, RGW halted all writes: reads kept working, but we could not
upload any new files. The non-data pools all have size set to 3, so there
should still have been 2 healthy copies of the index data. Also, ceph -s
showed no recovery I/O while recovery was running, so we tracked progress
through ceph df instead; once the SSD had backfilled, ceph -s jumped from
X degraded PGs straight back to OK.

Does anyone know how to fix this? I don't think writes should be halted
during recovery.
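For reference, this is roughly how the replication settings on the index
pools can be checked (the pool names below are the defaults and may differ
on other setups; if min_size is set equal to size, losing a single replica
is enough to make the affected PGs block I/O):

```shell
# List all pools with their size/min_size and other details
ceph osd pool ls detail

# Check replication settings on the RGW index pool
# (pool name assumed to be the default; adjust to your setup)
ceph osd pool get default.rgw.buckets.index size
ceph osd pool get default.rgw.buckets.index min_size
```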
Thanks
Josef Z
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com