On Wed, 6 Aug 2014 08:05:33 -0700 (PDT) Sage Weil wrote:

> On Wed, 6 Aug 2014, Mark Nelson wrote:
> > On 08/05/2014 06:19 PM, Mark Kirkwood wrote:
> > > On 05/08/14 23:44, Mark Nelson wrote:
> > > > On 08/05/2014 02:48 AM, Mark Kirkwood wrote:
> > > > > On 05/08/14 03:52, Tregaron Bayly wrote:
> > > > > > Does anyone have any insight on how we can tune librbd to
> > > > > > perform closer to the level of the rbd kernel module?
> > > > > >
> > > > > > In our lab we have a four node cluster with a 1GbE public
> > > > > > network and a 10GbE cluster network. A client node connects to
> > > > > > the public network with 10GbE.
> > > > > >
> > > > > > When doing benchmarks on the client using the kernel module we
> > > > > > get decent performance and can cause the OSD nodes to max out
> > > > > > their 1GbE link at peak while servicing the requests:
> > > > > >
> > > > > >                  tx                 rx
> > > > > > max    833.66 Mbit/s  |  639.44 Mbit/s
> > > > > > max    938.06 Mbit/s  |  707.35 Mbit/s
> > > > > > max    846.78 Mbit/s  |  702.04 Mbit/s
> > > > > > max    790.66 Mbit/s  |  621.92 Mbit/s
> > > > > >
> > > > > > However, using librbd we only get about 30% of that
> > > > > > performance, and I can see that it doesn't seem to generate
> > > > > > requests fast enough to max out the links on the OSD nodes:
> > > > > >
> > > > > > max    309.74 Mbit/s  |  196.77 Mbit/s
> > > > > > max    300.15 Mbit/s  |  154.38 Mbit/s
> > > > > > max    263.06 Mbit/s  |  154.38 Mbit/s
> > > > > > max    368.91 Mbit/s  |  234.38 Mbit/s
> > > > > >
> > > > > > I know that I can play with cache settings to give the client
> > > > > > better service on hits, but I'm wondering how I can soup up
> > > > > > librbd so that it can take advantage of more of the speed
> > > > > > available in the cluster. It seems like using librbd will
> > > > > > leave a lot of the resources idle.
> > > > >
> > > > > Hi Tregaron,
> > > > >
> > > > > I'm guessing that in the librbd case you are injecting the
> > > > > volume into a VM before running your tests - it might be
> > > > > interesting to see your libvirt XML for the VM... in particular
> > > > > the 'cache' setting for the rbd volume. If this is not set, or
> > > > > is 'default', then changing it to 'none' will probably be
> > > > > significantly faster. In addition, adding:
> > > > >
> > > > > io='native'
> > > > >
> > > > > may give a bit of a boost too!
> > > >
> > > > Oh, that reminds me: also make sure to use the virtio bus instead
> > > > of ide or something else. That can make a very large performance
> > > > difference.
> > >
> > > Yes, good point Mark (man, this plethora of Marks is confusing...).
> > > That reminds me, we currently have some libvirt configs in the docs
> > > that use
> > >
> > > bus='ide'
> > >
> > > ...we should probably weed 'em out - or at least mention that virtio
> > > is the preferred bus (e.g.
> > > http://ceph.com/docs/master/rbd/libvirt/#summary)
> >
> > Ugh, I thought we had gotten rid of all of those. Good catch.
>
> BTW, do we still need to use something != virtio in order for
> trim/discard?

AFAIK only IDE and virtio-scsi work with/for TRIM and DISCARD.
Never mind the sorry state of the kernelspace interface.

Christian

> sage
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Christian Balzer        Network/Systems Engineer
chibi at gol.com          Global OnLine Japan/Fusion Communications
http://www.gol.com/
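
For anyone following along later, here is a rough sketch of the kind of
libvirt disk stanza the thread above is talking about, with cache='none',
io='native' and bus='virtio' set. The pool/image name, monitor address
and secret UUID are placeholders for illustration only, and the cephx
<auth> block is only needed when authentication is enabled:

    <disk type='network' device='disk'>
      <!-- cache='none' and io='native' as suggested in the thread -->
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <!-- placeholder pool/image name and monitor address -->
      <source protocol='rbd' name='rbd/vm-disk-1'>
        <host name='192.168.0.1' port='6789'/>
      </source>
      <!-- cephx auth; the UUID is a placeholder for a libvirt secret -->
      <auth username='libvirt'>
        <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
      </auth>
      <!-- virtio rather than ide for better throughput -->
      <target dev='vda' bus='virtio'/>
    </disk>

If TRIM/DISCARD matters more than raw virtio-blk throughput, the usual
approach is bus='scsi' on the target, a virtio-scsi controller
(<controller type='scsi' model='virtio-scsi'/>) and discard='unmap' on
the driver line, per Christian's point that only IDE and virtio-scsi
pass discard through.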