I've noticed a pretty steep performance degradation when using RBDs with
LIO. I've tried a multitude of configurations to see whether anything
improves performance, and I've only found a few that (sort of) work.
Details about the systems being used:
- All network hardware for the data path is 10GbE; there is some management traffic on 1GbE, but I can assure you it isn't being used for data (perf & bwm-ng show this)
- Ceph version 0.80.5
- 20GB RBD (for our tests; production will be much larger, but the size doesn't seem to matter)
- LIO version 4.1.0, RisingTide
- Initiator is another Linux system (though I've used ESXi as well with no difference)
- We have 8 OSD nodes, each with 8 2TB OSDs, 64 OSDs total
* 4 nodes are in one rack and 4 in another; the CRUSH map has been configured to reflect this
* All OSD nodes are running CentOS 6.5
- 2 gateway nodes on HP ProLiant blades (I've only been using one for testing, but the problem exists on both)
* All gateway nodes are running CentOS 7
I've tested a multitude of things, mainly to see where the issue lies (a rough sketch of how these setups were wired up follows the lists below).
- The performance of the RBD as a target using LIO
- The performance of the RBD itself (no iSCSI or LIO)
- LIO performance by using a ramdisk as a target (no RBD involved)
- Setting the RBD up with LVM, then using a logical volume from that as a target with LIO
- Setting the RBD up in RAID0 & RAID1 (single disk, using mdadm), then using that volume as a target with LIO
- Mounting the RBD as ext4, then using a disk image and fileio as a target
- Mounting the RBD as ext4, then using a disk image as a loop device and blockio as a target
- Setting the RBD up as a loop device, then setting that up as a target with LIO
- What tested with poor performance (reads ~25-50 MB/s, writes ~25-50 MB/s)
* RBD set up directly as a target using LIO
* RBD -> LVM -> LIO target
* RBD -> RAID0/1 -> LIO target
- What tested with good performance (reads ~700-800 MB/s, writes ~400-700 MB/s)
* RBD on local system, no iSCSI
* Ramdisk (No RBD) -> LIO target
* RBD -> Mounted ext4 -> disk image -> LIO fileio target
* RBD -> Mounted ext4 -> disk image -> loop device -> LIO blockio target
* RBD -> loop device -> LIO target
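For anyone who wants to reproduce this, here is a minimal sketch of how the plain "RBD -> LIO target" case (the slow one) and the two workarounds that tested well were wired up. This assumes targetcli and the kernel rbd client; the pool, image, IQN, IP, size, and path names below are just placeholders.

    # Baseline (slow): RBD mapped on the gateway, exported as an LIO block backstore
    rbd map testpool/testimage                  # appears as e.g. /dev/rbd0
    targetcli /backstores/block create name=rbd_test dev=/dev/rbd0
    targetcli /iscsi create iqn.2014-09.com.example:rbd-test
    targetcli /iscsi/iqn.2014-09.com.example:rbd-test/tpg1/luns create /backstores/block/rbd_test
    targetcli /iscsi/iqn.2014-09.com.example:rbd-test/tpg1/portals create 10.0.0.1 3260
    targetcli /iscsi/iqn.2014-09.com.example:rbd-test/tpg1/acls create iqn.1994-05.com.redhat:client1

    # Workaround A (fast): ext4 on the RBD, disk image on it, LIO fileio backstore
    mkfs.ext4 /dev/rbd0
    mkdir -p /mnt/rbdfs && mount /dev/rbd0 /mnt/rbdfs
    truncate -s 18G /mnt/rbdfs/disk.img
    targetcli /backstores/fileio create name=rbd_img file_or_dev=/mnt/rbdfs/disk.img

    # Workaround B (fast): loop device on top of the RBD, LIO block backstore
    losetup -f --show /dev/rbd0                 # prints e.g. /dev/loop0
    targetcli /backstores/block create name=rbd_loop dev=/dev/loop0

    # The LUN/portal/ACL steps for the workarounds are the same as the baseline,
    # just pointing at the fileio or loop-backed backstore instead.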
I'm just curious whether anybody else has experienced these issues, has any idea what's going on, or has suggestions for fixing this. I know using loop devices sounds like a solution, but we hit a brick wall with the fact that loop devices are single-threaded. The intent is to use this with VMware ESXi, with the 2 gateways set up as paths to the target block devices. I'm not opposed to using something somewhat kludgy, provided we can still use multipath iSCSI within VMware.
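For context, the intended two-gateway layout is roughly the sketch below: each gateway maps the same RBD and exports it under the same IQN, with its portal bound to its own IP, so ESXi sees two paths to the same device. This is just an illustration of the intent (IPs and IQN are placeholders), not a setup I'm claiming performs well yet.

    # On each gateway (e.g. 10.0.0.1 and 10.0.0.2), map the same image...
    rbd map testpool/testimage
    # ...create the same backstore and target, and bind the portal to that gateway's own IP
    targetcli /backstores/block create name=rbd_test dev=/dev/rbd0
    targetcli /iscsi create iqn.2014-09.com.example:rbd-test
    targetcli /iscsi/iqn.2014-09.com.example:rbd-test/tpg1/luns create /backstores/block/rbd_test
    targetcli /iscsi/iqn.2014-09.com.example:rbd-test/tpg1/portals create 10.0.0.1 3260   # 10.0.0.2 on the other gateway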
Thanks for any help anyone can provide!