On Fri, 13 Mar 2015, Charles 'Boyo wrote:
> Hello all.
>
> When, if ever, will Ceph clients have the ability to prefer certain
> OSDs/hosts over others?
>
> I am running 3 replica pools across 3 data centers connected by
> relatively narrow links. Writes have to travel out anyway but I'd prefer
> to keep reads local.
>
> The thinking is that since all writes are synchronous across all
> replicas, it should be okay to read from a secondary that is nearer,
> faster or lightly loaded instead of reaching out to the primary at all
> times.
>
> Can this be done with a CRUSH tweak?
> Any ideas?

All of the logic to do this is in place on the client side, including the
ability to choose the replica closest to the client based on distance
within the CRUSH hierarchy (based on nearest shared ancestor).

What's missing is some additional work on the OSD side to make sure that
these reads are safe when they race with writes or (I believe) when the
cluster is undergoing some sort of rebalancing. There are a few tickets in
the tracker for these issues.

The first step to track them down would be to extend the ceph_test_rados
tool to issue reads to replicas, and then to issue racing reads and
writes. With the testing tools in place we can build some confidence that
our fixes are correct and that we've addressed all the issues...

sage
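
For illustration only, here is a rough sketch of what "closest replica by nearest shared ancestor" means. This is not Ceph code: the function names, the root-first location tuples, and the OSD ids are all made up for the example, and the real client works on the CRUSH map rather than on plain tuples. It also says nothing about the OSD-side safety issues mentioned above.

```python
# Hypothetical sketch, not Ceph code. A location is a root-first path of
# CRUSH bucket names, e.g. ("default", "dc1", "rack3", "host12").

def shared_depth(a, b):
    """Count the leading buckets two locations have in common."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


def pick_nearest_replica(client_loc, acting):
    """Return the OSD id sharing the deepest ancestor with the client.

    `acting` is a list of (osd_id, location) pairs with the primary first.
    Only a strictly deeper match replaces the current choice, so ties fall
    back to the earliest entry (the primary, if it is as close as any).
    """
    best_id, best_depth = None, -1
    for osd_id, loc in acting:
        d = shared_depth(client_loc, loc)
        if d > best_depth:
            best_id, best_depth = osd_id, d
    return best_id


# Assumed example layout: three data centers, one replica in each.
client = ("default", "dc2", "rack1", "client-host")
acting = [
    (4, ("default", "dc1", "rack2", "host-a")),   # primary, remote DC
    (11, ("default", "dc2", "rack5", "host-b")),  # replica in the client's DC
    (23, ("default", "dc3", "rack1", "host-c")),  # replica in a third DC
]
print(pick_nearest_replica(client, acting))  # prints 11
```

With a client in dc2, the replica in dc2 shares two levels of the hierarchy (root and data center) while the others share only the root, so it is chosen over the primary.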