Hi Jake,

thank you! What does reliability have to do with rbd_cache = true? I mean,
aside from the fact that if a host powers down, the in-flight data is lost.
Are there any special limitations / issues with rbd_cache = true and
iSCSI tgt?

--
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:info@xxxxxxxxxxxxxxxxx

Address:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 at the District Court of Hanau
Managing director: Oliver Dzombic

Tax No.: 35 236 3622 1
VAT ID: DE274086107

On 11.07.2016 at 22:24, Jake Young wrote:
> I'm using this setup with ESXi 5.1 and I get very good performance. I
> suspect you have other issues. Reliability is another story (see Nick's
> posts on tgt and HA to get an idea of the awful problems you can have),
> but for my test labs the risk is acceptable.
>
> One change I found helpful is to run tgtd with 128 threads. I'm running
> Ubuntu 14.04, so I edited my /etc/init/tgt.conf file and changed the
> line that read:
>
> exec tgtd
>
> to
>
> exec tgtd --nr_iothreads=128
>
> If you're not concerned with reliability, you can enhance throughput
> even more by enabling the rbd client write-back cache in your tgt VM's
> ceph.conf file (you'll need to restart tgtd for this to take effect):
>
> [client]
> rbd_cache = true
> rbd_cache_size = 67108864                     # (64 MB)
> rbd_cache_max_dirty = 50331648                # (48 MB)
> rbd_cache_target_dirty = 33554432             # (32 MB)
> rbd_cache_max_dirty_age = 2
> rbd_cache_writethrough_until_flush = false
>
> Here's a sample targets.conf:
>
> <target iqn.2014-04.tgt.Charter>
>     initiator-address ALL
>     scsi_sn Charter
>     #vendor_id CEPH
>     #controller_tid 1
>     write-cache on
>     read-cache on
>     driver iscsi
>     bs-type rbd
>     <backing-store charter/vmguest>
>         lun 5
>         scsi_id cfe1000c4a71e700506357
>     </backing-store>
>     <backing-store charter/voting>
>         lun 6
>         scsi_id cfe1000c4a71e700507157
>     </backing-store>
>     <backing-store charter/oradata>
>         lun 7
>         scsi_id cfe1000c4a71e70050da7a
>     </backing-store>
>     <backing-store charter/oraback>
>         lun 8
>         scsi_id cfe1000c4a71e70050bac0
>     </backing-store>
> </target>
>
> I don't have FIO numbers handy, but I have some Oracle calibrate IO
> output.
>
> We're running Oracle RAC database servers in Linux VMs on ESXi 5.1,
> which use iSCSI to connect to the tgt service. I only have a single
> connection set up in ESXi for each LUN. I tested using multipathing and
> two tgt VMs presenting identical LUNs/RBD disks, but found that there
> wasn't a significant performance gain by doing this, even with
> round-robin path selection in VMware.
>
> These tests were run from two RAC VMs, each on a different host, with
> both hosts connected to the same tgt instance. The way we have Oracle
> configured, it would have been using two of the LUNs heavily during this
> calibrate IO test.
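>
> (A rough way to apply and sanity-check those two changes, as a sketch
> only: it assumes the stock Ubuntu 14.04 upstart job and that an "admin
> socket" is configured in the [client] section of the tgt VM's ceph.conf;
> the socket path below is just an example.)
>
> # restart tgt so the new --nr_iothreads value and the [client] cache
> # settings are picked up (Ubuntu 14.04 / upstart)
> sudo service tgt restart
>
> # ask the running librbd client which rbd_cache settings it is actually
> # using; adjust the socket path to match your "admin socket" setting
> sudo ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok \
>     config show | grep rbd_cache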
>
>
> This output is with 128 threads in tgtd and rbd client cache enabled:
>
> START_TIME           END_TIME               MAX_IOPS   MAX_MBPS  MAX_PMBPS    LATENCY      DISKS
> -------------------- -------------------- ---------- ---------- ---------- ---------- ----------
> 28-JUN-016 15:10:50  28-JUN-016 15:20:04       14153        658        412         14         75
>
> This output is with the same configuration, but with rbd client cache
> disabled:
>
> START_TIME           END_TIME               MAX_IOPS   MAX_MBPS  MAX_PMBPS    LATENCY      DISKS
> -------------------- -------------------- ---------- ---------- ---------- ---------- ----------
> 28-JUN-016 22:44:29  28-JUN-016 22:49:05        7449        161        219         20         75
>
> This output is from a directly connected EMC VNX5100 FC SAN with 25
> disks using dual 8Gb FC links on a different lab system:
>
> START_TIME           END_TIME               MAX_IOPS   MAX_MBPS  MAX_PMBPS    LATENCY      DISKS
> -------------------- -------------------- ---------- ---------- ---------- ---------- ----------
> 28-JUN-016 22:11:25  28-JUN-016 22:18:48        6487        299        224         19         75
>
> One of our goals for our Ceph cluster is to replace the EMC SANs. We've
> accomplished this performance-wise; the next step is to get a plausible
> iSCSI HA solution working. I'm very interested in what Mike Christie is
> putting together. I'm in the process of vetting the SUSE solution now.
>
> BTW - the tests were run when we had 75 OSDs, which are all 7200 RPM 2 TB
> HDs, across 9 OSD hosts. We have no SSD journals; instead we have all
> the disks set up as single-disk RAID1 disk groups with WB cache and BBU.
> All OSD hosts have 40Gb networking and the ESXi hosts have 10Gb.
>
> Jake
>
> On Mon, Jul 11, 2016 at 12:06 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi Mike,
>
> I was trying:
>
> https://ceph.com/dev-notes/adding-support-for-rbd-to-stgt/
>
> One target, exported directly from different OSD servers, to multiple
> VMware ESXi servers.
>
> A config looked like:
>
> # cat iqn.ceph-cluster_netzlaboranten-storage.conf
>
> <target iqn.ceph-cluster:vmware-storage>
>     driver iscsi
>     bs-type rbd
>     backing-store rbd/vmware-storage
>     initiator-address 10.0.0.9
>     initiator-address 10.0.0.10
>     incominguser vmwaren-storage RPb18P0xAqkAw4M1
> </target>
>
> We had 4 OSD servers. Every one of them had this config running.
> We had 2 VMware servers (ESXi).
>
> So we had 4 paths to this vmware-storage RBD image.
>
> In the end, VMware had 8 paths (the 4 paths directly connected to the
> specific VMware server, plus the 4 paths that server saw via the other
> VMware server).
>
> Performance was very poor; I am talking about < 10 MB/s. The customer
> was not able to use it, so good old NFS is serving instead.
>
> At that time we used Ceph Hammer, and I think the customer was using
> ESXi 5.5, or maybe ESXi 6; the testing was sometime last year.
>
> --------------------
>
> We will make a new attempt now with Ceph Jewel and ESXi 6, and this time
> we will manage the VMware servers ourselves.
>
> As soon as the
>
> "ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2"
>
> issue, which I already mailed to the list, is fixed, we can start the
> testing.
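>
> (One way to separate librbd/cluster performance from the iSCSI layer is
> to benchmark the RBD image directly with fio. A minimal sketch, assuming
> fio was built with the rbd engine; the pool/image name comes from the
> config above, and clientname=admin is just an example. Note that this
> writes to the image, so only run it against a scratch image:)
>
> # 1M sequential writes straight against the RBD image, bypassing tgt/iSCSI
> fio --name=rbd-baseline --ioengine=rbd --clientname=admin \
>     --pool=rbd --rbdname=vmware-storage --invalidate=0 \
>     --rw=write --bs=1M --iodepth=32 --runtime=60 --time_based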
>
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:info@xxxxxxxxxxxxxxxxx
>
> Address:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402 at the District Court of Hanau
> Managing director: Oliver Dzombic
>
> Tax No.: 35 236 3622 1
> VAT ID: DE274086107
>
> On 11.07.2016 at 17:45, Mike Christie wrote:
> > On 07/08/2016 02:22 PM, Oliver Dzombic wrote:
> >> Hi,
> >>
> >> does anyone have experience with how to connect VMware with Ceph in a
> >> smart way?
> >>
> >> iSCSI multipath did not really work well.
> >
> > Are you trying to export rbd images from multiple iscsi targets at the
> > same time, or just one target?
> >
> > For the HA/multiple-target setup, I am working on this for Red Hat. We
> > plan to release it in RHEL 7.3/RHCS 2.1. SUSE ships something already,
> > as someone mentioned.
> >
> > We just got a large chunk of code into the upstream kernel (it is in
> > the block layer maintainer's tree for the next kernel), so it should be
> > simple to add COMPARE_AND_WRITE support now. We should be posting krbd
> > exclusive lock support in the next couple of weeks.
> >
> >> NFS could work, but I think that's just too many layers in between
> >> to get usable performance.
> >>
> >> Systems like ScaleIO have developed a VMware addon to talk with it.
> >>
> >> Is there something similar out there for Ceph?
> >>
> >> What are you using?
> >>
> >> Thank you!
> >
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com