On Tue, 4 Jun 2013, Nigel Williams wrote:
> On 4/06/2013 9:16 AM, Chen, Xiaoxi wrote:
> > My $0.02: you really don't need to wait for health_ok between your
> > recovery steps, just go ahead. Every time a new map is generated and
> > broadcast, the old map and the in-progress recovery are canceled.
>
> Thanks Xiaoxi, that is helpful to know.
>
> It seems to me that there might be a failure mode (or race condition?)
> here though, as the cluster is now struggling to recover: the
> replacement OSD caused the cluster to go into backfill_toofull.
>
> The failure sequence might be:
>
> 1. From HEALTH_OK, crash an OSD
> 2. Wait for recovery
> 3. Remove the OSD using the usual procedure
> 4. Wait for recovery
> 5. Add the OSD back using the usual procedure
> 6. Wait for recovery
> 7. Cluster is unable to recover due to toofull conditions
>
> Perhaps a test case is needed that round-trips a cluster through a
> known failure/recovery scenario like this.
>
> Note this is a simplistically configured test cluster with CephFS in
> the mix and about 2.5 million files.
>
> Something else I noticed: I restarted the cluster (and set the leveldb
> compact option, since I'd run out of space on the root filesystems) and
> now I see it is again making progress on the backfill. It seems odd
> that the cluster pauses but a restart clears the pause; is that by
> design?

Does the monitor data directory share a disk with an OSD? If so, that
makes sense: compaction freed enough space to drop below the threshold...

sage
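
For reference, the remove/re-add round trip in steps 3-5 above looks
roughly like the following on a cluster of that era. This is only a
sketch, not taken from the thread: it assumes the failed OSD is osd.3
on a host called node1 and that the manual removal/addition procedure
from the docs is being followed; exact commands vary by version and
distro.

    # step 1-2: mark the failed OSD out and let recovery finish
    ceph osd out osd.3

    # step 3: remove it from the cluster
    /etc/init.d/ceph stop osd.3
    ceph osd crush remove osd.3
    ceph auth del osd.3
    ceph osd rm 3

    # step 5: add it back -- allocate an id, initialise the data dir,
    # register its key, put it in the CRUSH map, start the daemon
    ceph osd create
    ceph-osd -i 3 --mkfs --mkkey
    ceph auth add osd.3 osd 'allow *' mon 'allow rwx' \
        -i /var/lib/ceph/osd/ceph-3/keyring
    ceph osd crush add osd.3 1.0 host=node1
    /etc/init.d/ceph start osd.3

It is the backfill triggered by that last step that pushed the other
OSDs over the threshold in Nigel's test.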
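
A rough way to confirm that the pause really is the full-ratio
threshold, again as a sketch rather than anything from the thread: the
option names are the ones from that era, the 0.90 value is only an
example, and the wildcard injectargs form may need adjusting on older
releases.

    # see which PGs are stuck and how full each OSD is
    ceph health detail
    ceph pg dump | grep backfill_toofull

    # the thresholds involved (defaults of that era):
    #   mon osd nearfull ratio   = 0.85
    #   mon osd full ratio       = 0.95
    #   osd backfill full ratio  = 0.85   <- backfill_toofull trips on this

    # nudging the backfill ratio up on a test cluster can let a stuck
    # backfill proceed, at the cost of free-space headroom
    ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.90'

    # the leveldb compaction Nigel enabled is presumably this ceph.conf
    # option, which matters when the mon store shares a disk with an OSD:
    # [mon]
    #     mon compact on start = true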