2012/1/19 Tommi Virtanen <tommi.virtanen@xxxxxxxxxxxxx>:
> On Wed, Jan 18, 2012 at 13:57, Andrey Stepachev <octo47@xxxxxxxxx> wrote:
>> Thanks for the clarification. I haven't dug into how data is moved when
>> an OSD is marked "out", and how the PG for that data is calculated (in
>> other words, how the new location for the missing replica is calculated,
>> and what happens when the "out" OSD comes back).
>
> That would be the CRUSH algorithm just getting a different cluster
> state as input, and thus deciding on a new location for (some of) the
> data. At that point, any data that needs to be migrated would be
> migrated. If the same osd id returns (on new hardware or old), once
> it's marked "in", the CRUSH algorithm will again start placing data on
> it. Whether that's the same set of data or not depends on whether
> anything else changed in the cluster in the meanwhile.
>
> Currently, CRUSH is best described in the academic papers about Ceph.
> There's a quick braindump/simplified explanation at
> http://ceph.newdream.net/docs/latest/dev/placement-group/

Thanks. I'll read that.

>
>> We aren't trying to use Ceph over a true WAN. But we are trying to find
>> a DFS that will operate well in our environment: multiple datacenters,
>> relatively low latency (<10ms on average), and irregular datacenter
>> outages. We need synchronous replication, and we need a bigtable (HBase,
>> in the case of Ceph). I believe Ceph can be configured to operate in
>> such an environment, but I could be completely wrong, so I'm trying to
>> check some boundary conditions.
>
> As long as you understand that a network blip translates to a storage
> blip, with Ceph. That is, we don't just write to a master and hope that
> replication catches up at some later time.

Yes, but in the case of a datacenter outage that takes the master with it,
no writes are possible at all. So I'm trying to trade some DFS performance
for HA across several datacenters.

>
>> Yeah, it would be good to see some progress reporting (like mdadm
>> shows), but it is not critical. Can you point me to which component
>> does this job? The MDS?
>
> This is answered in http://ceph.newdream.net/docs/latest/dev/delayed-delete/

Looks like the docs are the prime place to read, not the wiki. Thanks,
I'll read the whole docs.

--
Andrey.
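
P.S. To make the remapping behaviour Tommi describes concrete, here is a
minimal sketch using rendezvous-style hashing as a stand-in for CRUSH. This
is NOT the real CRUSH algorithm, and the names (place, osds_in, the osd.N
labels) are invented for illustration; the point is only that placement is a
pure function of the object name and the current cluster state, so an OSD
leaving or returning deterministically changes or restores the mapping.

    import hashlib

    def _score(obj, osd):
        # Deterministic pseudo-random score for an (object, osd) pair.
        return int(hashlib.md5(f"{obj}:{osd}".encode()).hexdigest(), 16)

    def place(obj, osds_in, replicas=2):
        # Placement is a pure function of the object name and the set of
        # OSDs currently marked "in": rank every OSD by its score for
        # this object and keep the top `replicas`.
        return sorted(osds_in, key=lambda osd: _score(obj, osd),
                      reverse=True)[:replicas]

    cluster = {"osd.0", "osd.1", "osd.2", "osd.3"}
    print(place("pg_1.a", cluster))              # some two OSDs
    print(place("pg_1.a", cluster - {"osd.2"}))  # osd.2 marked out: any
                                                 # replica it held gets a
                                                 # new, deterministic home
    print(place("pg_1.a", cluster))              # same cluster state as
                                                 # before, so the original
                                                 # placement comes back

Real CRUSH layers weights, a failure-domain hierarchy, and tunables on top
of this idea, which is why the papers and the placement-group doc linked
above remain the authoritative description.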