On Sat, Sep 6, 2014 at 7:50 AM, Dan van der Ster
<daniel.vanderster at cern.ch> wrote:

> BTW, do you happen to know, _if_ we re-use an OSD after the journal has
> failed, are any object inconsistencies going to be found by a
> scrub/deep-scrub?

I haven't tested this, but I did something I *think* is similar.

I deleted an OSD, removed it from the crushmap, marked it lost, then added
it back without reformatting. It got the same OSD ID. I think I spent
about 10 minutes doing it. I don't remember exactly why... I think I was
trying to force_pg_create or something.

If I recall correctly, the backfill was much faster than I expected. It
should have taken >24 hours. IIRC, it completed in about 2 hours. It
wasn't as fast as marking the OSD out and in, but much faster than a
freshly formatted OSD.

It's possible that this only worked because the PGs hadn't completed
backfilling. Despite my marking the OSD lost, the OSD was still listed in
the pg query, in the osds to probe section.

I want to experiment with losing an SSD. I'm trying to think of a way to
run the test using VMs, but I haven't come up with anything yet. All of my
test clusters are virtual, and I'm not ready to test this on a production
cluster yet.

I *think* losing an SSD will be similar to the above, possibly followed by
some inconsistencies found during scrub and deep-scrub.
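
For anyone who wants to check whether a "lost" OSD is still listed in the
pg query output, here is a rough Python sketch. It assumes the JSON from
`ceph pg <pgid> query` has a `recovery_state` list whose entries may carry
`probing_osds` and `down_osds_we_would_probe` keys; that is from memory and
may differ between Ceph versions. The helper name `osds_to_probe` and the
sample data are made up for illustration, not real cluster output.

```python
import json


def osds_to_probe(pg_query):
    """Collect OSD ids listed under the probe-related keys of recovery_state.

    Assumes the key names "probing_osds" and "down_osds_we_would_probe";
    ids are normalised to strings because they may be ints or strings.
    """
    found = {"probing_osds": [], "down_osds_we_would_probe": []}
    for state in pg_query.get("recovery_state", []):
        for key in found:
            for osd in state.get(key, []):
                osd = str(osd)
                if osd not in found[key]:
                    found[key].append(osd)
    return found


# Hypothetical, heavily trimmed sample; a real query has many more fields.
sample = json.loads("""
{
  "state": "peering",
  "recovery_state": [
    {"name": "Started/Primary/Peering",
     "probing_osds": [3, 7, 12],
     "down_osds_we_would_probe": [7]},
    {"name": "Started"}
  ]
}
""")

result = osds_to_probe(sample)
print(result)
```

On a real cluster, the saved output of `ceph pg <pgid> query` could be
loaded with `json.load` and passed to the same function.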