> In my experience, Ceph will add around 1 ms even if only on localhost.
> Whether this happens in the client code or on the OSDs, I don't really
> know. I don't even know the precise reason, but the latency is there
> nevertheless. Perhaps you can find the reason among the tradeoffs Ceph
> and similar systems have to make to ensure consistency even if a
> partition can happen at any time:
>
> https://en.wikipedia.org/wiki/PACELC_theorem
>
> With size=3, a write will first go to the primary OSD for the PG
> (0.1 ms), then from there to the two replica OSDs (in parallel), so
> about 0.2 ms more of round trips. Then back to the client, 0.1 ms.
> Add that ~0.4 ms of network to the ~1 ms of internal overhead and you
> get, very roughly, 1.4 ms even if storage latency were zero, which it
> never is, even for SSDs.
>
> If you set size=1, you can skip the step where the primary OSD
> replicates to the two replicas, but you still have Ceph's internal
> latency as well as the network latency to reach the primary OSD for
> whatever PG the object belongs to, which could be on any server. So
> expect a small improvement, but not too much.
>
> With that said, a single thread will never exceed 1000 IOPS in a
> typical setup.

Do you have an idea of how this has progressed across the releases of
the last few years? I thought they were addressing this type of
performance issue. I remember that when moving from direct disk access
to using LVM there were also people complaining about added latency.
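
For what it's worth, here is a small back-of-the-envelope model of the
write path described in the quote above. It is only a sketch: the
0.1 ms per network hop and the ~1 ms of internal overhead are the
assumptions stated in that post, not measurements, and the real write
path is more involved than this.

# Back-of-the-envelope model of the replicated write path quoted above.
# All numbers are the assumptions from that post, not measured values.

NET_HOP_MS = 0.1        # one-way client <-> OSD / OSD <-> OSD network hop
CEPH_INTERNAL_MS = 1.0  # rough per-op overhead inside Ceph itself

def write_latency_ms(size: int) -> float:
    """Estimate per-write latency for a pool with `size` replicas."""
    # Client -> primary OSD, and primary OSD -> client at the end.
    latency = 2 * NET_HOP_MS
    # The primary replicates to the (size - 1) replicas in parallel, so it
    # costs one extra round trip regardless of how many replicas there are.
    if size > 1:
        latency += 2 * NET_HOP_MS
    return latency + CEPH_INTERNAL_MS

for size in (3, 1):
    lat = write_latency_ms(size)
    # A single synchronous client thread issues one write at a time, so its
    # IOPS ceiling is simply one second divided by the per-op latency.
    print(f"size={size}: ~{lat:.1f} ms per write, "
          f"~{1000 / lat:.0f} IOPS from a single thread")

With those assumptions, size=3 comes out at roughly 1.4 ms per write
(around 700 IOPS from a single synchronous thread) and size=1 at
roughly 1.2 ms, which matches the "small improvement but not too much"
in the quote.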