On 04/10/2017 01:21 PM, Timofey Titovets wrote:
> JFYI: Today we got a totally stable Ceph + ESXi setup "without hacks",
> and it passes stress tests.
>
> 1. Don't try to pass an RBD directly to LIO; that setup is unstable.
> 2. Instead, use Qemu + KVM (I use Proxmox to create the VM).
> 3. Attach the RBD to the VM as a VIRTIO-SCSI disk (it must be exported
> by target_core_iblock).

I think you avoid the hung command problem because LIO uses the
local/initiator-side SCSI layer to send commands to the virtio-scsi
device, which has timeouts similar to ESX's. Those commands will time
out and fire the virtio-scsi error handler, so they will not just hang.

I think you can now do something similar with Ilya's patch and use krbd
directly with target_core_iblock:

https://www.spinics.net/lists/ceph-devel/msg35618.html

> 4. Make a LIO target in the VM.
> 4.1 Sync the initiator (ESXi) and target (LIO) options (it is best to
> change the target's options).
> 4.2 You can enable almost all VAAI primitives (also emulate_tpu=1,
> emulate_tpws=1).
> 4.3 For performance, use the noop scheduler on the RBD disk inside the
> VM and set is_nonrot=1 (this disables the ESXi scheduler).
> 5. ESXi is "stupid" and has a problem with CAS (the ATS heartbeat's
> COMPARE AND WRITE) on LIO, as it does with some other vendors' storage
> (google for info), so for stable operation without LUN disconnects set
> VMFS3.UseATSForHBOnVMFS5 to zero on every ESXi host that uses this
> LUN.
> 6. Don't try to make the target itself HA (not tested, but I think you
> will hit problems with VMFS); you must do something like VM HA for
> that.

Yes, the problem is for HA, where commands need to be cleaned up before
they are retried through different GWs/paths, so one command is not
racing with the retry and with new commands.

> This setup was tested with the latest ESXi and VMFS6.
>
> Thanks.

For reference, hedged command sketches for steps 2-5 and for the krbd
alternative follow below; the storage names, VM IDs, device paths, and
IQNs in them are illustrative, not taken from the posts above.
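For steps 2 and 3, a minimal sketch using Proxmox's qm CLI. It assumes
an RBD-backed Proxmox storage named "ceph-rbd" and VM ID 101, both
hypothetical names:

    # Use the virtio-scsi controller so the guest's SCSI layer (with
    # its command timeouts and error handler) sits in front of the
    # RBD device.
    qm set 101 --scsihw virtio-scsi-pci
    # Allocate a 1024 GB disk on the hypothetical "ceph-rbd" storage
    # and attach it to the VM as scsi0.
    qm set 101 --scsi0 ceph-rbd:1024

Inside the guest the disk then shows up as an ordinary SCSI device
(e.g. /dev/sda) that target_core_iblock can export.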
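The krbd alternative enabled by Ilya's patch would look roughly like
this on the target host; the pool "rbd" and image "esx-lun0" are
assumed names:

    # Map the image through the kernel RBD client; prints the device
    # node it created, e.g. /dev/rbd0.
    rbd map rbd/esx-lun0
    # Export the mapped device through target_core_iblock.
    targetcli /backstores/block create name=esx-lun0 dev=/dev/rbd0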
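For steps 4 through 4.3 inside the VM, a sketch assuming the
virtio-scsi disk appears as /dev/sda, with a made-up backstore name
and IQN:

    # 4.3: noop scheduler on the disk being exported.
    echo noop > /sys/block/sda/queue/scheduler
    # 4: create the iblock backstore and an iSCSI target for it.
    targetcli /backstores/block create name=esx-lun0 dev=/dev/sda
    targetcli /iscsi create iqn.2017-04.example.com:esx-lun0
    targetcli /iscsi/iqn.2017-04.example.com:esx-lun0/tpg1/luns \
        create /backstores/block/esx-lun0
    # 4.2/4.3: enable the thin-provisioning VAAI bits (UNMAP and
    # WRITE SAME with unmap) and mark the device non-rotational.
    targetcli /backstores/block/esx-lun0 set attribute \
        emulate_tpu=1 emulate_tpws=1 is_nonrot=1

The remaining TPG setup (portals, ACLs for the ESXi initiator IQNs) is
omitted here, and step 4.1's option syncing depends on which initiator
settings you change on the ESXi side.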
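Step 5 corresponds to a real ESXi advanced setting, and per VMware's
guidance it has to be changed on every host that mounts the datastore:

    # Disable ATS for the VMFS heartbeat so heartbeat I/O falls back
    # to plain SCSI reads/writes instead of COMPARE AND WRITE.
    esxcli system settings advanced set -i 0 -o /VMFS3/UseATSForHBOnVMFS5
    # Verify the current value.
    esxcli system settings advanced list -o /VMFS3/UseATSForHBOnVMFS5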