Hi Sage,

I've been trying to solve the issue mentioned in tracker #2047, which I think is the same one I described in http://www.spinics.net/lists/ceph-devel/msg05824.html. The attached patches seem to fix it for me. I also attempted to address the local-search issue you mentioned in #2047.

I'm testing this on a cluster with 3 rows, 2 racks/row, 2 hosts/rack, and 4 OSDs/host, against a CRUSH map with the rule:

  step take root
  step chooseleaf firstn 0 type rack
  step emit

I'm in the process of testing it as follows: I wrote some data to the cluster, then started shutting down OSDs using "init-ceph stop osd.n". For the first rack's worth, I shut the OSDs down sequentially, waiting for recovery to complete each time before stopping the next OSD. For the next rack, I shut down the first 3 OSDs on a host at the same time, waited for recovery to complete, then shut down the last OSD on that host. For the remaining racks, I shut down all the OSDs on the hosts in the rack at the same time. Right now I'm waiting for recovery to complete after shutting down the third rack.

After recovery completed following each phase so far, there were no degraded objects. So this is looking fairly solid to me so far.

What do you think?

Thanks -- Jim

Jim Schutt (2):
  ceph: retry CRUSH map descent before retrying bucket
  ceph: retry CRUSH map descent from root if leaf is failed

 src/crush/mapper.c | 30 ++++++++++++++++++++++--------
 1 files changed, 22 insertions(+), 8 deletions(-)

-- 
1.7.8.2
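
P.S. To illustrate the retry-from-root idea outside the actual mapper code, here's a toy, self-contained sketch. The names, hashing, and structure below are made up for illustration only and are not taken from the patches or from src/crush/mapper.c; the point is just that when the chosen leaf is failed, the whole descent restarts from the root with a new attempt number rather than re-picking locally within the same bucket.

/*
 * Toy illustration (not the actual src/crush/mapper.c change): if the
 * leaf picked by the descent is failed, restart the whole descent from
 * the root with a new attempt number, so the replacement can land in a
 * different rack/host rather than only being re-picked locally.
 */
#include <stdio.h>
#include <stdbool.h>

#define NUM_RACKS     2
#define OSDS_PER_RACK 4

static bool osd_failed[NUM_RACKS * OSDS_PER_RACK];

/* stand-in for CRUSH's hash: mixes the input with the attempt number */
static unsigned toy_hash(unsigned x, unsigned attempt)
{
	x ^= attempt * 0x9e3779b9u;
	x ^= x >> 16;
	x *= 0x45d9f3bu;
	x ^= x >> 16;
	return x;
}

/* descend root -> rack -> osd; return -1 if the chosen leaf is failed */
static int descend_once(unsigned input, unsigned attempt)
{
	unsigned rack = toy_hash(input, attempt) % NUM_RACKS;
	unsigned osd  = toy_hash(input ^ rack, attempt) % OSDS_PER_RACK;
	int id = rack * OSDS_PER_RACK + osd;

	return osd_failed[id] ? -1 : id;
}

/* retry the *whole* descent from the root until a live leaf is found */
static int choose_leaf(unsigned input, unsigned max_retries)
{
	for (unsigned attempt = 0; attempt < max_retries; attempt++) {
		int id = descend_once(input, attempt);
		if (id >= 0)
			return id;
	}
	return -1;	/* give up: every descent landed on a failed leaf */
}

int main(void)
{
	/* mark one whole rack's worth of OSDs down */
	for (int i = 0; i < OSDS_PER_RACK; i++)
		osd_failed[i] = true;

	for (unsigned pg = 0; pg < 8; pg++)
		printf("pg %u -> osd %d\n", pg, choose_leaf(pg, 10));
	return 0;
}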