Re: Ceph-ISCSI


 



The issue with active/active is the following sequence of events:
The client initiator sends a write operation to gateway server A.
Server A does not respond within the client timeout.
The client initiator re-sends the failed write operation to gateway server B.
The client initiator sends another write operation to gateway server C (or B) on the same sector with different data.
Server A wakes up and writes its pending data, which overwrites the sector with old data.
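
To make the sequence above concrete, here is a minimal toy simulation in Python. It is only a sketch: the Disk and Gateway classes and the replay_race function are hypothetical names made up for illustration, not part of Ceph or any iSCSI target, and a dict stands in for the RBD image.

```python
class Disk:
    """Shared backing store (stands in for the RBD image)."""

    def __init__(self):
        self.sectors = {}

    def write(self, sector, data):
        self.sectors[sector] = data


class Gateway:
    """Toy gateway that can stall while still holding queued writes."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        self.stalled = False
        self.pending = []

    def submit(self, sector, data):
        if self.stalled:
            # The write is queued but never acknowledged, so the client times out.
            self.pending.append((sector, data))
            return False
        self.disk.write(sector, data)
        return True

    def wake_up(self):
        # Flush whatever was queued while the gateway was unresponsive.
        self.stalled = False
        for sector, data in self.pending:
            self.disk.write(sector, data)
        self.pending.clear()


def replay_race():
    disk = Disk()
    a, b = Gateway("A", disk), Gateway("B", disk)
    a.stalled = True
    acked = a.submit(7, "old")   # write goes to A, which never responds
    if not acked:
        b.submit(7, "old")       # client retries the same write through B
    b.submit(7, "new")           # newer write to the same sector
    a.wake_up()                  # A recovers and flushes its stale write
    return disk.sectors[7]


print(replay_race())  # prints "old": the newer data has been overwritten
```

The point of the sketch is that nothing in the write path tells server A that its queued write is out of date, which is why the approaches below all involve either fencing or some form of ordering information.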

As Jason mentioned, this is an edge condition, but it poses challenges on how to deal with it. Some approaches:

-increase the client failover timeout and implement fencing with a smaller heartbeat timeout.
-implement a distributed operation counter (using a Ceph object or a distributed configuration/dlm tool) so that if server B gets an operation, it can detect that this happened because server A failed and start a fencing action.
-similar to the above, but rely on iSCSI session counters in Microsoft MCS. MPIO does not generate consecutive numbers across the different session paths.
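
As a rough illustration of the counter-plus-fencing idea, the toy model below has the backing store track an owner and an epoch and reject writes that carry a stale epoch. This is a sketch under assumptions, not how any existing gateway works: FencingDisk, FencedError, take_over and replay_with_fencing are hypothetical names, and in a real cluster the fencing would have to be enforced by the storage side rather than by an in-process Python object.

```python
class FencedError(Exception):
    """Raised when a fenced gateway tries to write."""


class FencingDisk:
    """Toy backing store that only accepts writes from the current owner."""

    def __init__(self):
        self.sectors = {}
        self.owner = None   # gateway currently allowed to write
        self.epoch = 0      # bumped on every ownership change

    def take_over(self, gateway_name):
        # Changing the owner implicitly fences the previous one.
        if gateway_name != self.owner:
            self.owner = gateway_name
            self.epoch += 1
        return self.epoch

    def write(self, gateway_name, epoch, sector, data):
        if gateway_name != self.owner or epoch != self.epoch:
            raise FencedError(f"{gateway_name} is fenced (epoch {epoch})")
        self.sectors[sector] = data


def replay_with_fencing():
    disk = FencingDisk()
    epoch_a = disk.take_over("A")
    stale = ("A", epoch_a, 7, "old")     # A stalls while holding this write
    epoch_b = disk.take_over("B")        # B receives I/O and fences A
    disk.write("B", epoch_b, 7, "old")   # retried write
    disk.write("B", epoch_b, 7, "new")   # newer write to the same sector
    try:
        disk.write(*stale)               # A wakes up and tries to flush
    except FencedError:
        pass                             # the stale write is rejected
    return disk.sectors[7]


print(replay_with_fencing())  # prints "new"
```

With the epoch check in place the late write from server A is refused, so the sector keeps the newer data. The cost is that every failover needs an ownership change that all gateways agree on.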

Maged

On 2017-10-17 12:23, Jorge Pinilla López wrote:

So, as I understand it, the final conclusion was that MCS needs to be supported in order to do Active/Active multipath.

How is that project going?

Windows will be able to support it because it is already implemented client-side, but unless ESXi implements it, VMware will only be able to do Active/Passive. Am I right?

On 17/10/2017 at 11:01, Frédéric Nass wrote:
Hi folks,
 
For those who missed it, the fun was here :-) : https://youtu.be/IgpVOOVNJc0?t=3715
 
Frederic.

----- On 11 Oct 17, at 17:05, Jake Young <jak3kaj@xxxxxxxxx> wrote:

On Wed, Oct 11, 2017 at 8:57 AM Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
On Wed, Oct 11, 2017 at 6:38 AM, Jorge Pinilla López <jorpilo@xxxxxxxxx> wrote:
As far as I understand, there are 2 ways of setting up iSCSI for Ceph:

1- using the kernel (lrbd), only available on SUSE, CentOS, Fedora...
The target_core_rbd approach is only utilized by SUSE (and its derivatives like PetaSAN) as far as I know. This was the initial approach for Red Hat-derived kernels as well until the upstream kernel maintainers indicated that they really do not want a specialized target backend for just krbd. The next attempt was to re-use the existing target_core_iblock to interface with krbd via the kernel's block layer, but that hit similar upstream walls trying to get support for SCSI command passthrough to the block layer.
 
2- using userspace (tcmu, ceph-iscsi-config, ceph-iscsi-cli)
The TCMU approach is what upstream and Red Hat-derived kernels will support going forward. 
 
The lrbd project was developed by SUSE to assist with configuring a cluster of iSCSI gateways via the cli.  The ceph-iscsi-config + ceph-iscsi-cli projects are similar in goal but take a slightly different approach. ceph-iscsi-config provides a set of common Python libraries that can be re-used by ceph-iscsi-cli and ceph-ansible for deploying and configuring the gateway. The ceph-iscsi-cli project provides the gwcli tool which acts as a cluster-aware replacement for targetcli.

I don't know which one is better. I see that official support is pointing to tcmu, but I haven't done any benchmarking.
We (upstream Ceph) provide documentation for the TCMU approach because that is what is available against generic upstream kernels (starting with 4.14 when it's out). Since it uses librbd (which still needs to undergo some performance improvements) instead of krbd, we know that librbd 4k IO performance is slower compared to krbd, but 64k and 128k IO performance is comparable. However, I think most iSCSI tuning guides would already tell you to use larger block sizes (i.e. 64K NTFS blocks or 32K-128K ESX blocks).
 
Has anyone tried both? Do they give the same results? Are both able to manage multiple iSCSI targets mapped to a single RBD disk?
Assuming you mean multiple portals mapped to the same RBD disk, the answer is yes, both approaches should support ALUA. The ceph-iscsi-config tooling will only configure Active/Passive because we believe there are certain edge conditions that could result in data corruption if configured for Active/Active ALUA.

The TCMU approach also does not currently support SCSI persistent group reservations (PGR, needed for Windows clustering) because that support isn't available in the upstream kernel. The SUSE kernel has an approach that utilizes two round-trips to the OSDs for each IO to simulate PGR support. Earlier this summer I believe SUSE started to look into how to get generic PGR support merged into the upstream kernel using corosync/dlm to synchronize the states between multiple nodes in the target. I am not sure of the current state of that work, but it would benefit all LIO targets when complete.
 
I will try to do my own testing, but if anyone has already tried this, hearing about it would be really helpful.


Jorge Pinilla López
jorpilo@xxxxxxxxx



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Jason
_______________________________________________
Thanks Jason!
 
You should cut and paste that answer into a blog post on ceph.com. It is a great summary of where things stand. 
 
Jake
 
 


--

Jorge Pinilla López
jorpilo@xxxxxxxxx
Computer engineering student
Systems department intern (SICUZ)
Universidad de Zaragoza
PGP-KeyID: A34331932EBC715A



 

 
