Re: [PATCH] iscsi: do not fail to stop a stopped pool

John Ferlan <jferlan@xxxxxxxxxx> · Thu, 30 Apr 2015 10:40:50 -0400

On 04/30/2015 06:59 AM, Ján Tomko wrote:
> On Wed, Apr 29, 2015 at 01:03:29PM -0400, John Ferlan wrote:
>>
>>
>> On 04/29/2015 10:40 AM, Ján Tomko wrote:
>>> On Wed, Apr 29, 2015 at 10:10:11AM -0400, John Ferlan wrote:
>>>>
>>>>
>>>> On 04/29/2015 09:08 AM, Ján Tomko wrote:
>>>>> Just as we allow stopping filesystem pools when they were unmounted
>>>>> externally, do not fail to stop an iscsi pool when someone else
>>>>> closed the session externally.
>>>>>
>>>>> Resolves:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1171984
>>>>
>>>> For this I disagree - it doesn't resolve all the issues in 1171984.
>>>
>>> I can remove the 'Resolves:' line.
>>>
>>
>> Fair enough
>>
>>>> It
>>>> resolves a symptom of libvirt allowing more than one pool to use the
>>>> same session.
>>>
>>> This resolves the error to stop a pool when there's no pool anymore,
>>> whether that's because someone called StopPool earlier on a pool that
>>> was duplicate but libvirt didn't catch it, or manually via messing with
>>> iscsiadm.
>>>
>>
>> I did some more testing of this and found one could start a domain using
>> the pool in the funky state which would cause the session to be logged
>> back in (I have authentication enabled too, so that could be a reason -
>> I didn't chase deeper)....
>>
> 
> I don't see the qemu driver doing any iscsi-related calls, is this done
> by qemu?
> 

The qemu code has a libiscsi component - I didn't chase down the path to
find the exact code that did the --login.  Because the logic comes with
an audible click on my laptop, I do know it happens... Like a dolphin click.

>> This patch only addresses failure to shutdown the second of the
>> duplicated pools, but trying to refresh the pool gets the same result -
>> that is an error since the session was logged out. However, if someone
>> calls refreshPool and there's a failure because of the logout, then the
>> pool gets marked as inactive.
>>
> 
> Well, the pool was shut down outside of libvirt's knowledge and libvirt
> marks is as inactive as soon as it's asked to refresh it, I think that's
> expected.
> 

Fair enough - it's the only path that uses the 'false' (e.g. probe
argument to elicit the error).

>> The checkPool will be "tricked" into believing the pool isn't started
>> which shouldn't be a huge deal unless pool-autostart is set for the
>> pool. 
> 
> Why tricked? The pool is not started - the session is not there.
> 

In quotes for lack of a better word... It was started, so was a second
one which shouldn't have been. So we have two pools started.  One is
stopped. The other would still be shown as active and is still
considered started, correct?  Pool and volume commands do not fail.
Nothing shut it down explicitly. Then someone/something restarts
libvirtd. Unless the pool has set autostart, the pool that was started
now is stopped. This would seemingly go against the recent series that
ensures started pools remain started after libvirtd restart. And yes,
sure someone could have used iscsiadm in order to logout and we'd be in
the same scenario, but I consider that even more of an edge condition.
You're not supposed to muck with things behind libvirt's back and if you
do, then you get what you get. I didn't check if this same issue would
be true if someone unmounted the path a pool was using, then restarted
libvirtd (eg, the scenario for the v2 of this patch).

>> But I think someone would have expected (based on a recent series)
>> that if a pool was started, that it could survive a 'service libvirtd
>> restart'.
>>
>> The findPoolSources, uploadVol, downloadVol, and wipeVol don't require
>> the session as they go direct to the block device
>>
>>>>
>>>> While there is disagreement over the method I've taken :
>>>>
>>>> http://www.redhat.com/archives/libvir-list/2015-April/msg01197.html
>>>>
>>>> Simply "covering up" the original issue by just ignoring the error on
>>>> stop doesn't seem to be the best solution to me.
>>>>
>>>
>>> The proposed series aims to detect duplicate pools on the same hosts.
>>> It does not deal with duplicate pools on different hosts.
>>> Even if we change the duplicate checks to only deal with the target,
>>> as I suggested here:
>>> https://www.redhat.com/archives/libvir-list/2015-April/msg00959.html
>>> (since libvirt's iscsi backend treats the same target on different hosts
>>> as a duplicate pool, but the check above does not), this patch also
>>> fixes stopping the pool after someone messes with iscsiadm manually,
>>>
>>>
>> As noted in my response, duplicated IQN's on truly different host names
>> (by name or address) is a different bug.  Not that it doesn't deserve to
>> be fixed, but let's take it one step at a time.
>>
> 
> 
>> There's still another issue of how to resolve the "same host" when using
>> different IP address families...
>>
> 
> I don't think that can be done, or that it should be attempted.
> 

The iscsiadm code seems to have figured it out

If I take the virbr0 on which my iSCSI server is using the IP Address
192.168.122.1 and execute "ifconfig virbr0 inet6 add 3ffe::103/64", then
execute:

iscsiadm -t sendtargets -m discovery -p 192.168.122.1

or

iscsiadm -t sendtargets -m discovery -p '[3ffe::103]'

I get the same IQN's output, although different portal paths. I haven't
dug into the iscsi code far enough yet to ascertain how it all works.

I could definitely see/understand that perhaps given the totality of the
issues where failure just because of duplicated IQN is almost a
necessity.  Although it is perhaps possible to determine duplicity by
using some sort of getifaddrs() logic where denial is made if the IPv4
and IPv6 addresses are using the same network interface. Yes, more
complexity.

John

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list