On Thu, Aug 22, 2013 at 2:34 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> You don't appear to have accounted for the 2x replication (where all
> writes go to two OSDs) in these calculations.

Ah. Right. So I should then be looking at:

# OSDs * Throughput per disk / 2 / repl factor ?

Which makes 300-400 MB/s aggregate throughput actually sort of reasonable.

> I assume your pools have size 2 (or 3?) for these tests. 3 would explain
> the performance difference entirely; 2x replication leaves it still a bit
> low but takes the difference down to ~350/600 instead of ~350/1200. :)

Yeah. We're doing 2x repl now, and haven't yet made the decision whether we're going to move to 3x repl or not.
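Back-of-the-envelope with that formula (assuming ~24 OSDs, which is roughly what the ~1200 MB/s raw figure implies at ~50 MB/s per disk, and reading the extra /2 as the journal double-write with journals co-located on the data disks):

    24 OSDs * ~50 MB/s       ~= 1200 MB/s raw
    / 2  (journal)           ~=  600 MB/s
    / 2  (replication)       ~=  300 MB/s

which lands right where our 300-400 MB/s aggregate has been.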
> You mentioned that your average osd bench throughput was ~50MB/s;
> what's the range?
41.9 - 54.7 MB/s
The actual average is 47.1 MB/s
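(For completeness: those per-OSD figures are from running the OSD bench against each OSD in turn, something along the lines of

    for i in $(seq 0 23); do ceph tell osd.$i bench; done

where the 0-23 range is just illustrative rather than our exact OSD numbering, and the results show up inline or in the cluster log depending on the release.)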
> Have you run any rados bench tests?
Yessir.
rados bench write:
2013-08-23 00:18:51.933594 min lat: 0.071682 max lat: 1.77006 avg lat: 0.196411
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   900      14     73322     73308   325.764       316   0.13978  0.196411
Total time run: 900.239317
Total writes made: 73322
Write size: 4194304
Bandwidth (MB/sec): 325.789
Stddev Bandwidth: 35.102
Max bandwidth (MB/sec): 440
Min bandwidth (MB/sec): 0
Average Latency: 0.196436
Stddev Latency: 0.121463
Max latency: 1.77006
Min latency: 0.071682
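For reference, that was a plain 900-second write bench with the defaults (4 MB objects, 16 concurrent ops), i.e. roughly:

    rados -p <pool> bench 900 write

with the actual pool name in place of <pool>.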
I haven't had any luck with the seq bench. It just errors every time.
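I suspect that's because the write runs cleaned up their objects afterwards; as far as I can tell, seq reads back objects left behind by a previous write bench and errors out if there's nothing there to read. So the sequence would presumably need to be a write bench with --no-cleanup first, then the seq bench against the same pool, and then deleting the benchmark objects by hand:

    rados -p <pool> bench 60 write --no-cleanup
    rados -p <pool> bench 60 seq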
> What is your PG count across the cluster?
pgmap v18263: 1650 pgs: 1650 active+clean; 946 GB data, 1894 GB used, 28523 GB / 30417 GB avail; 498MB/s wr, 124op/s
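That 1650 is the total across all pools (the pgmap line above is straight out of ceph -s / ceph -w). If the per-pool breakdown is useful I can pull it with something like:

    ceph osd lspools
    ceph osd pool get <pool> pg_num

with each pool name in place of <pool>.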
Thanks again.