On 23.05.2012 10:30, Stefan Priebe - Profihost AG wrote:
> On 22.05.2012 23:11, Greg Farnum wrote:
>> On Tuesday, May 22, 2012 at 2:00 PM, Stefan Priebe wrote:
>>> On 22.05.2012 22:49, Greg Farnum wrote:
>>>> Anyway, it looks like you're just paying a synchronous write penalty
>>>
>>> What exactly does that mean? Shouldn't a single-threaded write to four
>>> 260MB/s devices give at least 100MB/s?
>>
>> Well, with dd you've got a single thread issuing synchronous IO requests
>> to the kernel. We could have it set up so that those synchronous requests
>> get split up, but they aren't, and between the kernel and KVM it looks
>> like when it needs to make a write out to disk it sends one request at a
>> time to the Ceph backend. So you aren't writing to four 260MB/s devices;
>> you are writing to one 260MB/s device without any pipelining: you send
>> off a 4MB write, then wait until it's done, then send off a second 4MB
>> write, then wait until it's done, etc.
>> Frankly, I'm surprised you aren't getting a bit more throughput than
>> you're seeing (I remember other people getting much more out of less
>> beefy boxes), but it doesn't much matter, because what you really want
>> to do is enable the client-side writeback cache in RBD, which will
>> dispatch multiple requests at once and not force writes to be committed
>> before reporting back to the kernel. Then you should indeed be writing
>> to four 260MB/s devices at once. :)
>
> OK, I understand that, but the question is still where the bottleneck
> is in this case. I mean, I see no more than 40% network load, no more
> than 10% CPU load, and only 40MB/s to the SSD. I would still expect a
> network load of 70-90%.

*grr* I found a broken SATA cable ;-(

This is now with the SATA cable replaced and with rbd cache turned on:

systembootimage:/mnt# dd if=/dev/zero of=test bs=4M count=1000
1000+0 records in
1000+0 records out
4194304000 bytes (4,2 GB) copied, 57,9194 s, 72,4 MB/s

systembootimage:/mnt# dd if=test of=/dev/null bs=4M count=1000
1000+0 records in
1000+0 records out
4194304000 bytes (4,2 GB) copied, 46,3499 s, 90,5 MB/s

rados write bench, 8 threads:

Total time run:        60.222947
Total writes made:     1519
Write size:            4194304
Bandwidth (MB/sec):    100.892
Average Latency:       0.317098
Max latency:           1.88908
Min latency:           0.089681

Stefan
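
A side note on Greg's pipelining point above: it can be checked directly,
since an IO generator that keeps several 4MB writes in flight should beat
single-threaded dd on the same volume. A minimal sketch using fio (assuming
fio and libaio are available in the guest; the file name, size, and queue
depth here are illustrative, not taken from this thread):

  # keep 8 asynchronous 4MB writes in flight instead of dd's
  # one-request-at-a-time pattern
  fio --name=rbdtest --filename=/mnt/test --size=4G \
      --rw=write --bs=4M --ioengine=libaio --direct=1 --iodepth=8

If this runs much faster than the dd write above, the gap is the missing
pipelining rather than the raw device speed.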
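
For the archive: the client-side writeback cache Greg mentions is enabled
via ceph.conf on the client. A minimal sketch (the size and flush values
below are assumptions to illustrate the tunables, not settings taken from
this thread):

  [client]
      rbd cache = true                 # enable the RBD writeback cache
      rbd cache size = 33554432        # 32MB cache (assumed value)
      rbd cache max dirty = 25165824   # start flushing at 24MB dirty (assumed)
      rbd cache max dirty age = 1.0    # flush dirty data older than 1s (assumed)

Depending on the qemu version, the drive may also need to be configured for
writeback caching (e.g. cache=writeback on the -drive line) before the RBD
cache is actually used; check the docs for your qemu's rbd driver.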
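
The exact rados bench invocation isn't quoted above; given the roughly
60-second run and 8 threads, it was presumably something like the following
(the pool name is a placeholder):

  rados bench -p <pool> 60 write -t 8

Here -t sets the number of concurrent operations, which is why this run
pipelines where dd does not.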