> [ Please stay on the list. :) ]

Doh. Was trying to get Outlook to quote properly, and forgot to hit
Reply-all. :)

> >> The specifics of what data will migrate where will depend on how
> >> you've set up your CRUSH map, when you're updating the CRUSH
> >> locations, etc, but if you move an OSD then it will fully participate
> >> in recovery and can be used as the authoritative source for data.
> >
> > Ok, so if data chunk "bar" lives only on OSDs 3, 4, and 5, and OSDs 3,
> > 4, and 5 suddenly vanish for some reason but then come back later
> > (with their data intact), the cluster will recover more-or-less
> > gracefully? That is, it *won't* go "sorry, your RBD 'foobarbaz' lost
> > 'bar' for a while, all that data is gone"? I would *assume* it has a
> > way to recover more-or-less gracefully, but it's also not something I
> > want to discover the answer to later. :)
>
> Well, if the data goes away and you try to read it, the request is just
> going to hang, and presumably eventually the kernel/hypervisor/block
> device/whatever will time out and throw an error. At that point you have
> a choice between marking it lost (at which point it will say ENOENT to
> requests to access it, and RBD will turn that into zero blocks) or
> getting the data back online. When you do bring it back online, it will
> peer and then be accessible again without much fuss (if you only bring
> back one copy it might kick off a bunch of network and disk traffic
> re-replicating).

Awesome, that's exactly how I would want it to work. Even if the drives
themselves all somehow manage to catch fire at the same time, I can still
recover some of the data on the RBD by marking the missing objects lost;
and as long as the drives are okay, I should be able to bring all of the
data back.

Thanks for your help; I really appreciate it! Now to do some testing and
fiddling. :)
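
P.S. For anyone who finds this thread in the archives later, here is the
rough command sequence I'm planning to test. The PG id (2.5) below is just
a placeholder, so take this as a sketch of my understanding rather than a
verified procedure:

    # See whether any placement groups are reporting missing ("unfound")
    # objects, and which ones:
    ceph health detail
    ceph pg dump_stuck unclean

    # Preferred path: bring the vanished OSDs (with their data intact)
    # back up, let the PGs peer, and watch recovery finish:
    ceph osd tree
    ceph -s

    # Last resort, if the data really is gone for good: give up on the
    # unfound objects so client I/O stops hanging (RBD should then read
    # those blocks back as zeros). Replace 2.5 with a PG id from the
    # health output:
    ceph pg 2.5 mark_unfound_lost revert

Obviously the middle step is the one I'm hoping to rely on: getting the
original OSDs (and their intact data) back online so the PGs can peer
again.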