On Tue, Jul 23, 2013 at 9:12 AM, Matthew Walster <matthew@xxxxxxxxxxx> wrote:
> On 23 July 2013 17:07, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>
>> If you have three osds that are
>> separated by 5ms each and all hosting a PG, then your lower-bound
>> latency for a write op is 10ms - 5 ms to send from the primary to the
>> replicas, 5ms for them to ack back.
>
>
> And without wanting to sound daft having missed a salient configuration
> detail, but there's no way to release when it's written the primary?

Definitely not. Ceph's consistency guarantees and recovery mechanisms
are all built on top of all the replicas having a consistent copy and
that breaks if you do primary-only acks. Maybe in the future something
like this will happen, but it's all very blue-sky right now.

> Likewise, there's no way of influencing a client to write to a particular
> structure in the CRUSH map in preference? i.e. influence the write so it
> tries to read/write from local where possible/available? Essentially I'm
> saying "if I have a structure of DC, rack, server, disk; can I say "this
> client is part of this DC, operate here first" and let the OSDs deal with
> the replication?

You can do things like say "this data is always accessed from this
location" and set up your pools and crush rules to associate the data
with a location; you cannot write to arbitrary replicas.
There is some limited work around doing things like "read from local
host if it's there" (which exists now) and "read from the closest
CRUSH item" (which I think exists in a branch somewhere, but I'm not
sure), but it's got some consistency issues right now (possible to get
stale data).
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
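
For reference, the lower-bound arithmetic quoted above (5 ms out, 5 ms back, 10 ms total) can be written down as a small sketch. This is not Ceph code. The function name is made up for illustration, and it assumes the primary sends to all replicas in parallel and acks the client only once every replica has acked, so the slowest round trip sets the floor. It ignores the client-to-primary hop and any disk or queueing time.

```python
def write_latency_lower_bound_ms(one_way_ms_to_replicas):
    """Rough lower bound on a replicated write, seen from the primary.

    one_way_ms_to_replicas: one-way network latency (ms) from the primary
    to each replica. Hypothetical helper, for illustration only.
    """
    if not one_way_ms_to_replicas:
        # No replicas to wait for, so no network round trip.
        return 0.0
    # Assumption: replicas are contacted in parallel and the write completes
    # only after all of them ack, so the slowest round trip dominates.
    return 2.0 * max(one_way_ms_to_replicas)


# Three OSDs separated by 5 ms each: one primary, two replicas.
print(write_latency_lower_bound_ms([5.0, 5.0]))
```

With one replica 5 ms away and another 20 ms away, the same function gives 40 ms, which is why a single distant replica sets the floor for every write to that PG.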