Re: Questions about OSD recovery

On Wed, Feb 8, 2012 at 19:14, Josh Durgin <josh.durgin@xxxxxxxxxxxxx> wrote:
> It's possible to do what the current repair code does
> automatically, but this would be a bad idea since it just takes
> the first osd (with primary before replicas) to have the object
> as authoritative, and copies it to all the relevant osds. If the
> primary has a corrupt copy, this corruption will spread to other
> osds. In your case, since you removed the object entirely, repair
> could correct it.

At the risk of stating the obvious: if you have >=3 copies, you could
hash them all and let the majority decide which is the "good" copy.

An admin could do this manually, just deleting the bad one and letting
scrub repair it, and later on we might be able to automate it.

I'm not sure if Dynamo's/Cassandra's anti-entropy feature does this,
or if it's a simple "master overwrites slaves", and I realize the
multi-party communication is sort of hard to coordinate, but it's
definitely possible. I loves me some Merkle trees.

Of course, there might be cases where e.g. all 3 replicas have
different content.
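
To make the majority idea concrete, here is a rough sketch of the vote.
It is not Ceph code: the function name pick_majority, the dict of
osd name -> object bytes, and the choice of sha1 are all assumptions
made for illustration. It returns nothing when no strict majority
exists, which covers the "all replicas differ" case.

```python
import hashlib
from collections import Counter


def pick_majority(replicas):
    """Vote on replica content by hash.

    replicas: dict mapping a (hypothetical) osd name to the object's bytes.
    Returns (digest, good_osds, bad_osds), or None if no strict majority
    of replicas agree on the content.
    """
    if not replicas:
        return None
    # Hash every copy so only digests need to be compared.
    digests = {osd: hashlib.sha1(data).hexdigest()
               for osd, data in replicas.items()}
    digest, votes = Counter(digests.values()).most_common(1)[0]
    # Require a strict majority; a tie or all-different set is undecidable.
    if votes * 2 <= len(replicas):
        return None
    good = sorted(osd for osd, d in digests.items() if d == digest)
    bad = sorted(osd for osd, d in digests.items() if d != digest)
    return digest, good, bad


if __name__ == "__main__":
    copies = {"osd.0": b"abc", "osd.1": b"abc", "osd.2": b"abX"}
    print(pick_majority(copies))
```

The osds in the "bad" list are the ones an admin would delete the
object from before letting scrub repair it.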

In many ways, getting a hash stored alongside the object is
significantly better, and might be a better route to go -- our objects
are big enough, as opposed to typical Dynamo/Cassandra cells that are
often smaller than a sha1.
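
A minimal sketch of the stored-hash approach, again purely
illustrative: the record layout (a dict with "data" and "sha1") and
the helper names store, is_intact and pick_authoritative are
assumptions, not anything Ceph provides. The point is that each copy
can be checked on its own, so repair can skip a corrupt primary
instead of spreading it.

```python
import hashlib


def store(data):
    """Build a record that keeps a sha1 digest next to the object data."""
    return {"data": data, "sha1": hashlib.sha1(data).hexdigest()}


def is_intact(record):
    """Recompute the digest and compare it with the stored one."""
    return hashlib.sha1(record["data"]).hexdigest() == record["sha1"]


def pick_authoritative(copies):
    """Return the first osd whose copy verifies, or None if none do.

    copies: list of (osd_name, record) pairs, primary first, replicas after.
    """
    for osd, record in copies:
        if is_intact(record):
            return osd
    return None


if __name__ == "__main__":
    good = store(b"payload")
    corrupt = dict(good, data=b"pay1oad")  # data changed, stored hash stale
    print(pick_authoritative([("osd.0", corrupt), ("osd.1", good)]))
```

Unlike the majority vote, this works with two copies, or even one,
since no comparison between replicas is needed.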

