Poor RBD performance as LIO iSCSI target

Christopher Spearman <neromaverick@xxxxxxxxx> · Mon, 27 Oct 2014 14:24:04 -0700

I've noticed a pretty steep performance degradation when using RBDs with
LIO. I've tried a multitude of configurations to see whether any of them
change the performance, and I've only found a few that work (sort of).

Details about the systems being used:

 - All network hardware for data is 10GbE. There is some management on
   1GbE, but I can assure you that it isn't being used (perf & bwm-ng show
   this)
 - Ceph version 0.80.5
 - 20GB RBD (for our test; prod will be much larger, but the size doesn't seem to matter)
 - LIO version 4.1.0, RisingTide
 - Initiator is another Linux system (I've also used ESXi, with no difference)
 - We have 8 OSD nodes, each with 8 2TB OSDs, 64 OSDs total
   * 4 nodes are in one rack and 4 in another; the CRUSH map has been configured to reflect this
   * All OSD nodes are running CentOS 6.5
 - 2 gateway nodes on HP ProLiant blades (I've only been using one for testing, but the problem exists on both)
   * All gateway nodes are running CentOS 7

I've tested a multitude of things, mainly to see where the issue lies:

 - The performance of the RBD as a target using LIO
 - The performance of the RBD itself (no iSCSI or LIO)
 - LIO performance by using a ramdisk as a target (no RBD involved)
 - Setting the RBD up with LVM, then using a logical volume from that as a target with LIO
 - Setting the RBD up in RAID0 & RAID1 (single disk, using mdadm), then using that volume as a target with LIO
 - Mounting the RBD as ext4, then using a disk image and fileio as a target
 - Mounting the RBD as ext4, then using a disk image as a loop device and blockio as a target
 - Setting the RBD up as a loop device, then setting that up as a target with LIO

 - Configurations that performed badly (reads ~25-50MB/s, writes ~25-50MB/s)
   * RBD set up as a target using LIO
   * RBD -> LVM -> LIO target
   * RBD -> RAID0/1 -> LIO target
 - Configurations that performed well (reads ~700-800MB/s, writes ~400-700MB/s)
   * RBD on local system, no iSCSI
   * Ramdisk (No RBD) -> LIO target
   * RBD -> Mounted ext4 -> disk image -> LIO fileio target
   * RBD -> Mounted ext4 -> disk image -> loop device -> LIO blockio target
   * RBD -> loop device -> LIO target
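
For anyone who wants to compare their own setup against the figures above, here is a minimal sketch of a sequential read test in Python. It is not the tool used for the numbers in this post; the function name, the 4 MiB block size, and the example device path in the last comment are assumptions chosen for illustration. Reads go through the kernel page cache, so drop caches first (echo 3 > /proc/sys/vm/drop_caches) or read more data than fits in RAM if you want cold numbers.

```python
import time


def sequential_read_mbps(path, block_size=4 * 1024 * 1024, max_bytes=None):
    """Read `path` front to back and return (bytes_read, MB/s).

    `path` may be a regular file or a block device. MB here means
    10**6 bytes. `max_bytes` optionally caps how much is read.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python-level buffering only; the kernel
    # page cache is still in the path.
    with open(path, "rb", buffering=0) as f:
        while max_bytes is None or total < max_bytes:
            want = block_size
            if max_bytes is not None:
                want = min(block_size, max_bytes - total)
            chunk = f.read(want)
            if not chunk:  # end of file or device
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return total, float("inf")
    return total, total / elapsed / 1e6


# Hypothetical usage (device path is an assumption, run as root):
#   print(sequential_read_mbps("/dev/rbd0", max_bytes=1024 ** 3))
```

Running the same function on the gateway against the mapped RBD and on the initiator against the iSCSI disk gives a like-for-like comparison of the two paths. A write test is left out on purpose, since writing to a raw device destroys its contents.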

I'm just curious whether anybody else has experienced these issues, has any
idea what's going on, or has any suggestions for fixing this. I know using
loop devices sounds like a solution, but we hit a brick wall with the
fact that loop devices are single-threaded. The intent is to use this with
VMware ESXi, with the 2 gateways each set up as a path to the target block
devices. I'm not opposed to using something somewhat kludgy, provided we
can still use multipath iSCSI within VMware.

Thanks for any help anyone can provide!
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com