Re: Ceph-ISCSI


 



On Thu, Oct 12, 2017 at 5:02 AM, Maged Mokhtar <mmokhtar@xxxxxxxxxxx> wrote:

On 2017-10-11 14:57, Jason Dillaman wrote:

On Wed, Oct 11, 2017 at 6:38 AM, Jorge Pinilla López <jorpilo@xxxxxxxxx> wrote:
As far as I am able to understand, there are two ways of setting up iSCSI for Ceph:

1- using the kernel (lrbd), only available on SUSE, CentOS, Fedora...
 
The target_core_rbd approach is only utilized by SUSE (and its derivatives like PetaSAN) as far as I know. This was the initial approach for Red Hat-derived kernels as well until the upstream kernel maintainers indicated that they really do not want a specialized target backend for just krbd. The next attempt was to re-use the existing target_core_iblock to interface with krbd via the kernel's block layer, but that hit similar upstream walls trying to get support for SCSI command passthrough to the block layer.
 
2- using userspace (tcmu, ceph-iscsi-config, ceph-iscsi-cli)
 
The TCMU approach is what upstream and Red Hat-derived kernels will support going forward. 
 
The lrbd project was developed by SUSE to assist with configuring a cluster of iSCSI gateways via the cli.  The ceph-iscsi-config + ceph-iscsi-cli projects are similar in goal but take a slightly different approach. ceph-iscsi-config provides a set of common Python libraries that can be re-used by ceph-iscsi-cli and ceph-ansible for deploying and configuring the gateway. The ceph-iscsi-cli project provides the gwcli tool which acts as a cluster-aware replacement for targetcli.
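 
If you are curious where that shared gateway state lives, ceph-iscsi-config keeps its configuration as a JSON blob in a RADOS object inside the cluster itself, so every gateway sees the same view. As a rough sketch (not the project's own tooling; the 'rbd' pool and 'gateway.conf' object names are the defaults I'd expect, so adjust for your deployment), you can dump it with the rados Python bindings:

    import json
    import rados

    # Minimal sketch: read the gateway configuration object that
    # ceph-iscsi-config maintains. Pool and object names are assumptions;
    # change them if your setup stores the config elsewhere.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('rbd')            # pool holding the config
        try:
            raw = ioctx.read('gateway.conf', 1024 * 1024)
            print(json.dumps(json.loads(raw), indent=2, sort_keys=True))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()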
 
I don't know which one is better. I see that official support is pointing to tcmu, but I haven't done any benchmarking.
 
We (upstream Ceph) provide documentation for the TCMU approach because that is what is available against generic upstream kernels (starting with 4.14 when it's out). It uses librbd (which still needs to undergo some performance improvements) instead of krbd; we know that librbd 4K IO performance is slower than krbd's, but 64K and 128K IO performance is comparable. However, I think most iSCSI tuning guides would already tell you to use larger block sizes (e.g. 64K NTFS blocks or 32K-128K ESX blocks).
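 
If you want a feel for that block-size effect without the iSCSI layer in the way, a crude sequential-write loop through librbd (via the rbd Python bindings) is enough. This is only a sketch, not a proper benchmark, and the 'iscsi-test' image name is a placeholder for a pre-created test image:

    import time
    import rados
    import rbd

    BLOCK_SIZES = [4096, 65536, 131072]      # 4K vs 64K vs 128K
    TOTAL_BYTES = 64 * 1024 * 1024           # write 64 MiB per block size

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')
    with rbd.Image(ioctx, 'iscsi-test') as image:    # existing test image
        for bs in BLOCK_SIZES:
            data = b'\0' * bs
            start = time.time()
            for offset in range(0, TOTAL_BYTES, bs):
                image.write(data, offset)            # synchronous librbd write
            mbps = TOTAL_BYTES / (time.time() - start) / (1024 * 1024)
            print('%7d-byte writes: %.1f MiB/s' % (bs, mbps))
    ioctx.close()
    cluster.shutdown()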
 
Has anyone tried both? Do they give the same results? Are both able to manage multiple iSCSI targets mapped to a single RBD disk?
 
Assuming you mean multiple portals mapped to the same RBD disk, the answer is yes, both approaches should support ALUA. The ceph-iscsi-config tooling will only configure Active/Passive because we believe there are certain edge conditions that could result in data corruption if configured for Active/Active ALUA.
 
The TCMU approach also does not currently support SCSI persistent reservation groups (needed for Windows clustering) because that support isn't available in the upstream kernel. The SUSE kernel has an approach that utilizes two round-trips to the OSDs for each IO to simulate PGR support. Earlier this summer I believe SUSE started to look into how to get generic PGR support merged into the upstream kernel using corosync/dlm to synchronize the states between multiple nodes in the target. I am not sure of the current state of that work, but it would benefit all LIO targets when complete.
 
I will try to do my own testing, but if anyone has tried this already, it would be really helpful.


Jorge Pinilla López
jorpilo@xxxxxxxxx






 
--
Jason


Hi Jason,

Similar to the TCMU user-space backstore approach, I would prefer that cluster synchronization of PRs and other task management be done in user space. It really does not belong in the kernel, and keeping it in user space gives more flexibility in implementation. A user-space PR get/set interface could be implemented via:

- corosync
- writing PR metadata to a Ceph object or a network share (a rough sketch of this option follows below)
- Ceph watch/notify
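
To illustrate the second option: the PR table could simply be a RADOS object that every gateway reads and rewrites while holding an exclusive advisory lock on it. A very rough sketch with the rados Python bindings (the object, lock, cookie and initiator names are made up for the example):

    import json
    import time
    import rados

    PR_OBJECT = 'iscsi_pr_state'     # made-up object name for this sketch
    LOCK_NAME = 'pr_lock'
    COOKIE = 'gw-1'                  # unique per gateway node

    def register_pr_key(ioctx, initiator, key):
        """Record a PR registration, serialized by a RADOS advisory lock."""
        while True:
            try:
                # The 30s lease means a crashed gateway cannot wedge the table.
                ioctx.lock_exclusive(PR_OBJECT, LOCK_NAME, COOKIE,
                                     desc='PR update', duration=30)
                break
            except rados.ObjectBusy:
                time.sleep(0.1)      # another gateway is updating the table
        try:
            try:
                state = json.loads(ioctx.read(PR_OBJECT, 1024 * 1024))
            except rados.ObjectNotFound:
                state = {'registrations': {}}
            state['registrations'][initiator] = key
            ioctx.write_full(PR_OBJECT, json.dumps(state).encode())
        finally:
            ioctx.unlock(PR_OBJECT, LOCK_NAME, COOKIE)

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')
    register_pr_key(ioctx, 'iqn.1994-05.com.redhat:client1', '0xabc123')
    ioctx.close()
    cluster.shutdown()

Watch/notify could then be layered on top so the other gateways learn about a change immediately instead of re-reading the object.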

Also, in the future it may be beneficial to build on or extend Ceph features such as exclusive locks and Paxos-based leader election, so applications such as iSCSI gateways can use them for resource distribution and failover as an alternative to Pacemaker, which has scalability limits.
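
For what I mean by that last point: a time-limited exclusive lock on a well-known RADOS object already behaves like a crude active/standby election. Very roughly (names are placeholders, and a real gateway would obviously do more than print):

    import time
    import rados

    LEADER_OBJ = 'iscsi_gw_leader'   # made-up object name
    LOCK_NAME = 'leader'
    NODE_ID = 'gw-1'                 # unique per gateway, also used as cookie
    LEASE_SECS = 10

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')

    while True:
        try:
            ioctx.lock_exclusive(LEADER_OBJ, LOCK_NAME, NODE_ID,
                                 desc='gateway leader lease',
                                 duration=LEASE_SECS)
        except rados.ObjectBusy:
            # Another gateway is active. If it dies, its lease expires and
            # the next attempt here succeeds; that is the failover.
            time.sleep(1)
            continue
        try:
            print('%s is the active gateway' % NODE_ID)   # serve iSCSI here
            time.sleep(LEASE_SECS - 2)                    # stay within the lease
        finally:
            ioctx.unlock(LEADER_OBJ, LOCK_NAME, NODE_ID)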

Maged


I would definitely love to eventually see a pluggable TCMU interface for distributing PGRs, port states, and any other shared state, to avoid the need to configure a separate corosync cluster. I think the reason the corosync/DLM approach is a very attractive starting point is that this logic already exists in SCST and can be ported to LIO (or at least used as a blueprint).

--
Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


