Hi,

we're using Ceph to serve VM images via RBD, so RBD performance is important for us. I've prepared some write benchmarks using different object sizes, once with 'rados bench' directly and once with 'rbd bench-write'. The results are interesting: raw RADOS write rates are significantly better for large objects (>128k), RBD performs better for medium-sized objects (>16k, <128k), but RBD is really slow for small writes. We have a lot of small writes, so this is the pain point. I think latencies are dominant here.

Our test setup consists of two Ceph servers running a MON and 9 OSDs (one OSD daemon per disk; ext4 filesystem) with journals on a shared SSD (one SSD partition per OSD). There are 2 GigE networks (storage frontend/backend) with approx. 62 µs RTT and jumbo frames enabled. See the attached ceph.conf for further details. Some parameters there are taken from the tuning recommendations at [1]. Note that I have to stick to ext4 on the OSDs.

Is there anything we can do to improve latencies? I don't know where to start:

* OSD setup?
* Network setup?
* ceph.conf parameter tuning?
* Separate MONs?
* Separate networks for MON access?

A lot of options... so I would be grateful for hints on what is worth looking at. Please refer to Bitbucket [2] for the benchmark scripts; a rough sketch of the invocations is included below for reference.

TIA

Christian

[1] http://ceph.com/community/ceph-bobtail-jbod-performance-tuning/
[2] https://bitbucket.org/ckauhaus/ceph_performance

--
Dipl.-Inf. Christian Kauhaus <>< · kc@xxxxxxxxxx · systems administration
gocept gmbh & co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany
http://gocept.com · tel +49 345 219401-11
Python, Pyramid, Plone, Zope · consulting, development, hosting, operations
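For reference, a minimal sketch of the kind of invocations the benchmark scripts wrap, assuming a pool named 'rbd' and a throwaway test image named 'bench' (the object sizes, runtime, thread count, and total I/O volume here are illustrative placeholders, not necessarily the exact values used in [2]):

  # create a 10 GB test image for rbd bench-write (placeholder name/size)
  rbd create bench --size 10240

  for size in 4096 16384 65536 131072 4194304; do
      # object-level writes straight into RADOS: $size-byte objects, 16 in flight, 60 s
      rados -p rbd bench 60 write -b $size -t 16
      # block-level writes through librbd with the same request size
      rbd bench-write bench --io-size $size --io-threads 16 --io-total 1073741824
  done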
Attachment: rados_vs_rbd_write_performance.png (PNG image)
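As a side note on the network figures above: the RTT and jumbo frame setup can be double-checked with a plain ping between the storage hosts. A small sketch, assuming a 9000-byte MTU and using kyle02 from the ceph.conf below as the peer:

  # round-trip time on the storage network (10 samples)
  ping -c 10 kyle02

  # verify that 9000-byte frames pass without fragmentation:
  # 8972 = 9000 bytes MTU - 20 bytes IP header - 8 bytes ICMP header
  ping -c 3 -M do -s 8972 kyle02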
[global]
fsid = b67bad36-3273-11e3-a2ed-0200000311bf
public network = 172.20.4.0/24
cluster network = 172.20.8.0/24
osd pool default min size = 1
osd pool default size = 2
osd pool default pg num = 25
osd pool default pgp num = 25
mon host = cartman02.sto.dev.gocept.net,kyle02.sto.dev.gocept.net,patty.sto.dev.gocept.net
ms dispatch throttle bytes = 335544320

[client]
log file = /var/log/ceph/client.log
rbd cache = true
rbd default format = 2

[mon]
mon host = cartman02,kyle02,patty
mon addr = 172.20.4.6:6789,172.20.4.9:6789,172.20.4.10:6789
mon data = /srv/ceph/mon/$cluster-$id

[mon.cartman02]
host = cartman02
mon addr = 172.20.4.6:6789
public addr = 172.20.4.6:6789
cluster addr = 172.20.8.4:6789

[mon.kyle02]
# ...

[osd]
public addr = 172.20.4.6
cluster addr = 172.20.8.4
filestore fiemap = true
filestore op threads = 1
filestore queue committing max bytes = 167772160
filestore queue max bytes = 167772160
filestore xattr use omap = true
journal max write bytes = 167772160
journal queue max bytes = 167772160
osd deep scrub interval = 2592000
osd journal size = 0
osd op threads = 4

[osd.0]
host = cartman02
osd uuid = c4b6d576-86d3-5e9a-9661-36b1fa36f4cf
osd data = /srv/ceph/osd/ceph-0
osd journal = /dev/vgjnl00/ceph-jnl00
filestore max sync interval = 102

[osd.1]
# ...
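For the latency question itself, the effective values of the [osd] tuning options above and the per-OSD journal/filestore latencies can be read from the OSD admin sockets. A rough sketch, assuming the default socket path and osd.0 running locally:

  # cluster-wide overview of per-OSD commit/apply latency (if available in this release)
  ceph osd perf

  # detailed perf counters, including journal and filestore latencies, for osd.0
  ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump

  # check which of the tuned values are actually in effect
  ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep -E 'op_threads|max_sync_interval'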