Re: Designing a new prio_callout


 



Thanks again, Hannes. We really appreciate your time on this.

Stefan's suggestion will be a great option for round-robin failover for our first release. We'll try to figure out a way to do that. It also sounds like using the ALUA callout is going to be the best long-term solution, which answers the original question that I posed.

As for putting failover paths on a different subnet, that will be up to the user to manage. We won't prevent that sort of thing, and network configurations are extremely flexible on our systems. The subnet also doesn't necessarily affect physical network paths on our system.

Again, thanks so much for your help!

On 8/27/07, Hannes Reinecke <hare@xxxxxxx> wrote:
Ethan John wrote:
> For the record, setting rr_min_io to something extremely large (we're using
> 2 billion now, since I'm assuming it's a C integer) solves the immediate
> problem that we're having (overhead in path switching causing poor
> performance). Telling people to use mpath_prio_random is still less than
> ideal for any small number of iSCSI targets, but it is a better short-term
> solution for us than nothing.
>
In setting rr_min_io to something extremely large you effectively
disable the round-robin scheduler in multipathing.
That's okay for the failover scenario you have (as you only have
one path per group), but whenever you have more than one path
in a group that wouldn't work anymore.
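
To illustrate the point above, here is a toy model in Python of a round-robin selector that moves to the next path after rr_min_io I/Os. It is only a sketch under simplifying assumptions, not the kernel's round-robin code; the function name simulate_round_robin and its parameters are made up for illustration.

```python
def simulate_round_robin(paths, rr_min_io, total_ios):
    """Toy model: send rr_min_io I/Os down one path, then move to the next.

    This only mimics the switching rule; it is not the kernel implementation.
    Returns a dict mapping each path name to the number of I/Os it received.
    """
    counts = {p: 0 for p in paths}
    index = 0             # position of the current path in `paths`
    sent_on_current = 0   # I/Os sent since the last path switch
    for _ in range(total_ios):
        if sent_on_current == rr_min_io:
            index = (index + 1) % len(paths)
            sent_on_current = 0
        counts[paths[index]] += 1
        sent_on_current += 1
    return counts
```

In this model, 1000 I/Os over two paths with rr_min_io set to 100 are split evenly, while an rr_min_io of 2 billion leaves every I/O on the first path. With a single path per group the value makes no difference.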

> On 8/10/07, Ethan John <ethan.john@xxxxxxxxx> wrote:
>> Hannes, thanks again for your help with this.
>>
>> I haven't noticed that failback does the right thing, but I'll try it out
>> again. Could be something we're doing wrong. In any case, there's very
>> little documentation on all this, and I'm trying to develop some kind of
>> strategy for our Linux customers to use until we get ALUA implemented.
>>
>> Being able to set path priorities manually would be ideal, but it seems
>> like this is impossible, right?
>>
>> Here's the situation we have right now. I initiate two connections to one
>> target, across two sessions with two different IPs, with two LUs. Multipath
>> looks like this:
>> mpath45 (20002c9020020001a00151b6b46bb57b0) dm-1 company,iSCSI target
>> [size=15G][features=0][hwhandler=0]
>> \_ round-robin 0 [prio=1][active]
>>  \_ 22:0:0:1 sdc 8:32  [active][ready]
>> \_ round-robin 0 [prio=1][enabled]
>>  \_ 23:0:0:1 sde 8:64  [active][ready]
>> mpath44 (20002c9020020001200151b6b46bb57ae) dm-0 company,iSCSI target
>> [size=15G][features=0][hwhandler=0]
>> \_ round-robin 0 [prio=1][enabled]
>>  \_ 22:0:0:0 sdb 8:16  [active][ready]
>> \_ round-robin 0 [prio=1][enabled]
>>  \_ 23:0:0:0 sdd 8:48  [active][ready]
>>
>> Note that there are only two active sessions:
>> # iscsiadm -m session
>> tcp: [20] 10.53.152.22:3260 ,1 iqn.2001-07.com.company:qaiscsi2:blah1
>> tcp: [21] 10.53.152.23:3260,2 iqn.2001-07.com.company:qaiscsi2:blah1
>>
>> So the result is that all activity is routed to the first session that was
>> initiated. I want to change the priorities of the paths to allow for traffic
>> to go to the first IP for mpath45 and the second IP for mpath44.
>>
That's a matter of IP routing. Having both targets on the same subnet
doesn't work very well with multipathing. Please set up your system with
each iSCSI target port in a different subnet, e.g.

10.53.152.22:3260,1 iqn.2001-07.com.company:qaiscsi2:blah1
10.53.153.22:3260,2 iqn.2001-07.com.company:qaiscsi2:blah1

then you'll have one iSCSI target port per subnet and you can actually
do failover etc.

>> Obviously ALUA is the way to go for this in the future, but we won't have
>> the resources to implement that, so I'm looking for an interim solution that
>> will scale to thousands of clients. Right now, the only thing I can tell
>> people is to manually initiate connections to certain targets through
>> certain IP addresses -- basically, doing the load balancing themselves. Is
>> there a better way?
>>
No, not really. But I'm not a network guru. You may want to ask on
the open-iscsi mailing list.

And you can get all information you need via sysfs, so it should
be possible to create a script like Stefan Bader suggested.
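
One way such a script could decide on a priority is sketched below in Python. This is an assumption about how the idea might be implemented, not Stefan's actual proposal: it picks a preferred portal for each LU from a checksum of its WWID and gives the path through that portal the higher priority. The function names, the priority values 50 and 1, and the checksum scheme are all hypothetical.

```python
import zlib


def preferred_portal(wwid, portals):
    """Pick one portal per LU deterministically from its WWID.

    Sorting makes the choice independent of the order in which the
    portals were discovered. The CRC32 scheme is an illustrative choice.
    """
    ordered = sorted(portals)
    return ordered[zlib.crc32(wwid.encode("ascii")) % len(ordered)]


def path_priority(wwid, portal, portals, high=50, low=1):
    """Return `high` for the path through the LU's preferred portal, else `low`."""
    return high if portal == preferred_portal(wwid, portals) else low
```

A real callout would still have to look up the WWID and the portal address for the device it is given (for example from sysfs, which is not shown here) and print the resulting integer on stdout. Note that a checksum only spreads LUs across portals on average; it does not guarantee that two particular LUs end up on different portals.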

Cheers,

Hannes
--
Dr. Hannes Reinecke                   zSeries & Storage
hare@xxxxxxx                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel



--
Ethan John
http://www.flickr.com/photos/thaen/
(206) 841.4157
