Re: ceph + vmware

I had some odd issues like that due to MTU mismatch. 

Keep in mind that the vSwitch and vmkernel port have independent MTU settings.  Verify that you can ping with large packets, without fragmentation, between your host and iSCSI target.
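
For example, something like this (assuming jumbo frames, i.e. MTU 9000; with the default MTU of 1500 use a payload size of 1472 instead):

vmkping -d -s 8972 <iscsi-target-ip>       (from the ESXi host; -d sets don't-fragment)
ping -M do -s 8972 <vmkernel-port-ip>      (from the Linux iSCSI target back to the vmkernel port)

If the large pings fail while small ones go through, it's the MTU.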

If that's not it, you can try disabling the VAAI options to see if one of them is causing issues. I haven't used ESXi 6.0 yet.
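
For reference, at least on 5.x the three VAAI primitives can be toggled per host with esxcli (0 disables, 1 re-enables); 6.0 may differ:

esxcli system settings advanced set -o /DataMover/HardwareAcceleratedMove -i 0
esxcli system settings advanced set -o /DataMover/HardwareAcceleratedInit -i 0
esxcli system settings advanced set -o /VMFS3/HardwareAcceleratedLocking -i 0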

Jake


On Friday, July 15, 2016, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
Hi,

I am currently trying this out.

My tgt config:

# cat tgtd.conf
# The default config file
include /etc/tgt/targets.conf

# Config files from other packages etc.
include /etc/tgt/conf.d/*.conf

nr_iothreads=128


-----

# cat iqn.2016-07.tgt.esxi-test.conf
<target iqn.2016-07.tgt.esxi-test>
  initiator-address ALL
  scsi_sn esxi-test
  #vendor_id CEPH
  #controller_tid 1
  write-cache on
  read-cache on
  driver iscsi
  bs-type rbd
  <backing-store vmware1/esxi-test>
  lun 1
  scsi_id cf10000c4a71e700506357
  </backing-store>
  </target>
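
For reference, after editing this file the target can be reloaded and
inspected on the tgt side with, for example:

tgt-admin --update ALL -v
tgtadm --lld iscsi --mode target --op show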


--------------


If I create a VM inside ESXi 6 and try to format the virtual HDD, I see
the following in the logs:

sd:2:0:0:0: [sda] CDB:
Write(10): 2a 00 0f 86 a8 80 00 01 40 00
mptscsih: ioc0: task abort: SUCCESS (rv=2002) (sc=ffff880068aa5e00)
mptscsih: ioc0: attempting task abort! ( sc=ffff880068aa4a80)

That is with the LSI HDD emulation. With the VMware paravirtual adapter,
everything just freezes.
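
(In case it matters: the aborts come from the LSI driver inside the
guest when a write does not complete in time. The guest's SCSI command
timeout can be checked with "cat /sys/block/sda/device/timeout"; it
defaults to 30 seconds on most distros.)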

Any ideas about this issue?

--
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:info@xxxxxxxxxxxxxxxxx

Address:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 at the Hanau local court
Managing director: Oliver Dzombic

Tax no.: 35 236 3622 1
VAT ID: DE274086107


On 11.07.2016 at 22:24, Jake Young wrote:
> I'm using this setup with ESXi 5.1 and I get very good performance.  I
> suspect you have other issues.  Reliability is another story (see Nick's
> posts on tgt and HA to get an idea of the awful problems you can have),
> but for my test labs the risk is acceptable.
>
>
> One change I found helpful is to run tgtd with 128 threads.  I'm running
> Ubuntu 14.04, so I edited my /etc/init/tgt.conf file and changed the
> line that read:
>
> exec tgtd
>
> to
>
> exec tgtd --nr_iothreads=128
>
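> A quick sanity check after restarting the service, to confirm the new
> thread count actually took effect (assuming a single tgtd process; nlwp
> is the total number of threads in the process):
>
> ps -o nlwp= -p $(pidof tgtd)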
>
> If you're not concerned with reliability, you can enhance throughput
> even more by enabling rbd client write-back cache in your tgt VM's
> ceph.conf file (you'll need to restart tgtd for this to take effect):
>
> [client]
> rbd_cache = true
> rbd_cache_size = 67108864 # (64MB)
> rbd_cache_max_dirty = 50331648 # (48MB)
> rbd_cache_target_dirty = 33554432 # (32MB)
> rbd_cache_max_dirty_age = 2
> rbd_cache_writethrough_until_flush = false
>
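> (With the Ubuntu 14.04 upstart job mentioned above, that restart should
> just be "service tgt restart"; keep in mind it will drop any active
> iSCSI sessions.)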
>
>
>
> Here's a sample targets.conf:
>
>   <target iqn.2014-04.tgt.Charter>
>   initiator-address ALL
>   scsi_sn Charter
>   #vendor_id CEPH
>   #controller_tid 1
>   write-cache on
>   read-cache on
>   driver iscsi
>   bs-type rbd
>   <backing-store charter/vmguest>
>   lun 5
>   scsi_id cfe1000c4a71e700506357
>   </backing-store>
>   <backing-store charter/voting>
>   lun 6
>   scsi_id cfe1000c4a71e700507157
>   </backing-store>
>   <backing-store charter/oradata>
>   lun 7
>   scsi_id cfe1000c4a71e70050da7a
>   </backing-store>
>   <backing-store charter/oraback>
>   lun 8
>   scsi_id cfe1000c4a71e70050bac0
>   </backing-store>
>   </target>
>
>
>
> I don't have FIO numbers handy, but I have some Oracle calibrate IO
> output.
>
> We're running Oracle RAC database servers in Linux VMs on ESXi 5.1,
> which use iSCSI to connect to the tgt service.  I only have a single
> connection set up in ESXi for each LUN.  I tested multipathing with
> two tgt VMs presenting identical LUNs/RBD disks, but found that there
> wasn't a significant performance gain from doing this, even with
> round-robin path selection in VMware.
>
>
> These tests were run from two RAC VMs, each on a different host, with
> both hosts connected to the same tgt instance.  The way we have Oracle
> configured, it would have been using two of the LUNs heavily during this
> calibrate IO test.
>
>
> This output is with 128 threads in tgtd and rbd client cache enabled:
>
> START_TIME           END_TIME               MAX_IOPS   MAX_MBPS  MAX_PMBPS   LATENCY       DISKS
> -------------------- -------------------- ---------- ---------- ---------- ---------- ----------
> 28-JUN-016 15:10:50  28-JUN-016 15:20:04       14153        658        412       14          75
>
>
> This output is with the same configuration, but with rbd client cache
> disabled:
>
> START_TIME           END_TIME               MAX_IOPS   MAX_MBPS  MAX_PMBPS   LATENCY       DISKS
> -------------------- -------------------- ---------- ---------- ---------- ---------- ----------
> 28-JUN-016 22:44:29  28-JUN-016 22:49:05    7449        161        219       20          75
>
> This output is from a directly connected EMC VNX5100 FC SAN with 25
> disks using dual 8Gb FC links on a different lab system:
>
> START_TIME           END_TIME               MAX_IOPS   MAX_MBPS  MAX_PMBPS   LATENCY       DISKS
> -------------------- -------------------- ---------- ---------- ---------- ---------- ----------
> 28-JUN-016 22:11:25  28-JUN-016 22:18:48    6487        299        224       19          75
>
>
> One of our goals for our Ceph cluster is to replace the EMC SANs.  We've
> accomplished this performance-wise; the next step is to get a plausible
> iSCSI HA solution working.  I'm very interested in what Mike Christie is
> putting together.  I'm in the process of vetting the SUSE solution now.
>
> BTW - The tests were run when we had 75 OSDs, which are all 7200RPM 2TB
> HDs, across 9 OSD hosts.  We have no SSD journals; instead, all the
> disks are set up as single-disk RAID1 disk groups with WB cache and
> BBU.  All OSD hosts have 40Gb networking and the ESXi hosts have 10G.
>
> Jake
>
>
> On Mon, Jul 11, 2016 at 12:06 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
>
>     Hi Mike,
>
>     i was trying:
>
>     https://ceph.com/dev-notes/adding-support-for-rbd-to-stgt/
>
>     ONE target, exported from several OSD servers directly, to multiple
>     VMware ESXi servers.
>
>     A config looked like:
>
>     #cat iqn.ceph-cluster_netzlaboranten-storage.conf
>
>     <target iqn.ceph-cluster:vmware-storage>
>     driver iscsi
>     bs-type rbd
>     backing-store rbd/vmware-storage
>     initiator-address 10.0.0.9
>     initiator-address 10.0.0.10
>     incominguser vmwaren-storage RPb18P0xAqkAw4M1
>     </target>
>
>
>     We had 4 OSD servers, and each of them had this config running.
>     We had 2 VMware servers (ESXi).
>
>     So we had 4 paths to this vmware-storage RBD object.
>
>     In the end, VMware saw 8 paths: 4 paths directly connected to the
>     specific VMware server, plus 4 paths that this VMware server saw via
>     the other VMware server.
>
>     There were very big performance problems, I am talking about < 10
>     MB/s. The customer was not able to use it, so good old NFS is serving
>     instead.
>
>     At that time we used Ceph Hammer, and I think the customer was using
>     ESXi 5.5, or maybe ESXi 6; the testing was sometime last year.
>
>     --------------------
>
>     We will now make a new attempt with Ceph Jewel and ESXi 6, and this
>     time we will manage the VMware servers ourselves.
>
>     As soon as this issue,
>
>     "ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2"
>
>     which I already mailed to the list, is solved, we can start the
>     testing.
>
>
>     --
>     Mit freundlichen Gruessen / Best regards
>
>     Oliver Dzombic
>     IP-Interactive
>
>     mailto:info@xxxxxxxxxxxxxxxxx
>
>     Address:
>
>     IP Interactive UG ( haftungsbeschraenkt )
>     Zum Sonnenberg 1-3
>     63571 Gelnhausen
>
>     HRB 93402 at the Hanau local court
>     Managing director: Oliver Dzombic
>
>     Tax no.: 35 236 3622 1
>     VAT ID: DE274086107
>
>
>     On 11.07.2016 at 17:45, Mike Christie wrote:
>     > On 07/08/2016 02:22 PM, Oliver Dzombic wrote:
>     >> Hi,
>     >>
>     >> does anyone have experience with how to connect VMware to Ceph in a
>     >> smart way?
>     >>
>     >> iSCSI multipath did not really work well.
>     >
>     > Are you trying to export rbd images from multiple iSCSI targets at the
>     > same time or just one target?
>     >
>     > For the HA/multiple target setup, I am working on this for Red Hat. We
>     > plan to release it in RHEL 7.3/RHCS 2.1. SUSE ships something already,
>     > as someone mentioned.
>     >
>     > We just got a large chunk of code in the upstream kernel (it is in the
>     > block layer maintainer's tree for the next kernel) so it should be
>     > simple to add COMPARE_AND_WRITE support now. We should be posting krbd
>     > exclusive lock support in the next couple weeks.
>     >
>     >
>     >> NFS could work, but I think that is just too many layers in between
>     >> to get usable performance.
>     >>
>     >> Systems like ScaleIO have developed a VMware addon to talk to them.
>     >>
>     >> Is there something similar out there for Ceph?
>     >>
>     >> What are you using?
>     >>
>     >> Thank you!
>     >>
>     >
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
