On Tue, Jul 9, 2013 at 7:08 PM, Li Wang <liwang@xxxxxxxxxxxxxxx> wrote:
> Hi,
> We did a simple throughput test on Ceph with 2 OSD nodes configured
> with one replica policy. For each OSD node, the throughput measured by 'dd'
> run locally is 117MB/s. Therefore, in theory, the two OSDs could provide
> 200+MB/s throughput. However, using 'iozone' from clients we only get a peak
> throughput at around 40MB/s. Is that because a write will incur 2 replica *
> 2 journal write? That is, writing into journal
> and replica doubled the traffic, respectively, which results in a total
> 4x write amplification. Is that true, or our understanding is wrong,
> and some performance tuning hint?

Well, that write amplification is certainly happening. However, I'd
expect to get a better ratio out of the disk than that. Couple things
to check:
1) is your benchmark dispatching multiple requests at a time, or just
one? (the latency on a single request is going to make throughput
numbers come out badly.)
2) How do your disks handle two simultaneous write streams? (Most
disks should be fine, but sometimes they struggle.)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
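
To put rough numbers on the two points above, here is a back-of-the-envelope sketch in Python. It assumes each object is written to 2 replicas and that each OSD's journal sits on the same disk as its data, so every replica write hits the disk twice; that is the 4x figure from the question. The function names, the 4 MB request size, and the 50 ms latency are hypothetical values chosen for illustration, not measurements from this cluster.

```python
def amplified_ceiling(disk_mb_s, num_osds, replicas=2, writes_per_replica=2):
    """Upper bound on client write throughput in MB/s.

    Assumes every client byte is written replicas * writes_per_replica
    times (journal + data store on the same disk), spread evenly over
    all OSD disks.
    """
    raw_bandwidth = disk_mb_s * num_osds
    amplification = replicas * writes_per_replica
    return raw_bandwidth / amplification


def latency_bound(request_mb, latency_ms, in_flight=1):
    """Throughput in MB/s when only `in_flight` requests are outstanding.

    Each request occupies its slot for `latency_ms`, so throughput is
    capped at in_flight * request_mb per latency period.
    """
    return request_mb * in_flight * 1000 / latency_ms


# Figures from the thread: 2 OSDs, 117 MB/s each as measured by dd.
ceiling = amplified_ceiling(disk_mb_s=117, num_osds=2)
print(f"ceiling with 4x amplification: {ceiling} MB/s")

# Hypothetical request size and latency, for illustration only.
for depth in (1, 2, 4):
    bound = min(latency_bound(4, 50, in_flight=depth), ceiling)
    print(f"in flight = {depth}: at most {bound} MB/s")
```

Under these assumptions the amplification alone caps the client at 58.5 MB/s, so 40 MB/s is below the ceiling but not wildly so. The second function shows why a benchmark that issues one request at a time can fall short of that ceiling regardless of disk speed: throughput scales with the number of requests in flight until the disks become the limit.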