On 04/10/2017 01:21 PM, Timofey Titovets wrote:
> JFYI: Today we got a totally stable Ceph + ESXi setup "without hacks",
> and it passes stress tests.
>
> 1. Don't try to pass an RBD directly to LIO; that setup is unstable.
> 2. Instead, use Qemu + KVM (I use Proxmox to create the VM).
> 3. Attach the RBD to the VM as a VIRTIO-SCSI disk (it must be exported
> by target_core_iblock).

I think you avoid the hung command problem because LIO uses the
local/initiator-side SCSI layer to send commands to the virtio-scsi
device, which has timeouts similar to ESX's. Those commands will time
out and fire the virtio-scsi error handler, so they will not just hang.

I think you can now do something similar with Ilya's patch and use krbd
directly with target_core_iblock:

https://www.spinics.net/lists/ceph-devel/msg35618.html

> 4. Make a LIO target in the VM.
> 4.1 Sync the initiator (ESXi) and target (LIO) options (it is best to
> change the target's options).
> 4.2 You can enable almost all VAAI primitives (also emulate_tpu=1,
> emulate_tpws=1).
> 4.3 For performance, use the noop scheduler on the RBD disk inside the
> VM and set is_nonrot=1 (this disables the ESXi scheduler).
> 5. ESXi is "stupid" and has a problem with CAS (the ATS heartbeat's
> COMPARE AND WRITE) on LIO, as it does with some other vendors' storage
> (google for info), so for stable operation without LUN disconnects set
> VMFS3.UseATSForHBOnVMFS5 to zero on every ESXi host that uses this
> LUN.
> 6. Don't try to make the target itself HA (not tested, but I think you
> will hit problems with VMFS); you must do something like VM HA for
> that.

Yes, the problem is for HA, where commands need to be cleaned up before
they are retried through different GWs/paths, so one command is not
racing with the retry and with new commands.

> This setup was tested with the latest ESXi and VMFS6.
>
> Thanks.

For reference, hedged command sketches for steps 2-5 and for the krbd
alternative follow below; the storage names, VM IDs, device paths, and
IQNs in them are illustrative, not taken from the posts above.
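For steps 2 and 3, a minimal sketch using Proxmox's qm CLI. It assumes
an RBD-backed Proxmox storage named "ceph-rbd" and VM ID 101, both
hypothetical names:

    # Use the virtio-scsi controller so the guest's SCSI layer (with
    # its command timeouts and error handler) sits in front of the
    # RBD device.
    qm set 101 --scsihw virtio-scsi-pci
    # Allocate a 1024 GB disk on the hypothetical "ceph-rbd" storage
    # and attach it to the VM as scsi0.
    qm set 101 --scsi0 ceph-rbd:1024

Inside the guest the disk then shows up as an ordinary SCSI device
(e.g. /dev/sda) that target_core_iblock can export.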
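The krbd alternative enabled by Ilya's patch would look roughly like
this on the target host; the pool "rbd" and image "esx-lun0" are
assumed names:

    # Map the image through the kernel RBD client; prints the device
    # node it created, e.g. /dev/rbd0.
    rbd map rbd/esx-lun0
    # Export the mapped device through target_core_iblock.
    targetcli /backstores/block create name=esx-lun0 dev=/dev/rbd0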
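For steps 4 through 4.3 inside the VM, a sketch assuming the
virtio-scsi disk appears as /dev/sda, with a made-up backstore name
and IQN:

    # 4.3: noop scheduler on the disk being exported.
    echo noop > /sys/block/sda/queue/scheduler
    # 4: create the iblock backstore and an iSCSI target for it.
    targetcli /backstores/block create name=esx-lun0 dev=/dev/sda
    targetcli /iscsi create iqn.2017-04.example.com:esx-lun0
    targetcli /iscsi/iqn.2017-04.example.com:esx-lun0/tpg1/luns \
        create /backstores/block/esx-lun0
    # 4.2/4.3: enable the thin-provisioning VAAI bits (UNMAP and
    # WRITE SAME with unmap) and mark the device non-rotational.
    targetcli /backstores/block/esx-lun0 set attribute \
        emulate_tpu=1 emulate_tpws=1 is_nonrot=1

The remaining TPG setup (portals, ACLs for the ESXi initiator IQNs) is
omitted here, and step 4.1's option syncing depends on which initiator
settings you change on the ESXi side.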
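Step 5 corresponds to a real ESXi advanced setting, and per VMware's
guidance it has to be changed on every host that mounts the datastore:

    # Disable ATS for the VMFS heartbeat so heartbeat I/O falls back
    # to plain SCSI reads/writes instead of COMPARE AND WRITE.
    esxcli system settings advanced set -i 0 -o /VMFS3/UseATSForHBOnVMFS5
    # Verify the current value.
    esxcli system settings advanced list -o /VMFS3/UseATSForHBOnVMFS5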