Re: [PATCH] multipathd: the sysfs prioritizer can return stale data

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> 
> On Jan 30, 2024, at 9:57 AM, Martin Wilck <mwilck@xxxxxxxx> wrote:
> 
> On Tue, 2024-01-30 at 11:41 -0500, Benjamin Marzinski wrote:
>> On Tue, Jan 23, 2024 at 06:00:07PM -0800, Brian Bunker wrote:
>>> When a path is lost and then reinstated later, the ALUA
>>> device handler will not pick up this change and continue
>>> to possibly provide incorrect (stale) information about
>>> its ALUA state to the 'sysfs' prioritizer if the path's
>>> priorities were changed prior to the loss of those paths.
>>> 
>>> On the loss of an I_T nexus, the path state should not
>>> continue to use its last known state. Many things and a lot
>>> of time could have passed between the path loss and its
>>> eventual reinstatemnt.
>>> 
>>> The ALUA device handler didn't have this issue since it
>>> always got the ALUA state from the target by sending an RTPG
>>> request on the path and updated the priority. However with
>>> the detect priority set to true by default, the 'sysfs'
>>> prioritzier will be used on targets that support ALUA rather
>>> than the 'alua' prioritizer.
>>> 
>>> Without re-evaluating the ALUA state when a path returns
>>> multipath is left with path states which do not reflect
>>> the actual ALUA states the target is providing.
>>> 
>>> 3624a9370f0f545fc7c3e46a100011010 dm-2 PURE,FlashArray
>>> size=3.0T features='0' hwhandler='1 alua' wp=rw
>>>> -+- policy='service-time 0' prio=50 status=active
>>>>> - 13:0:0:1 sdg 8:96  active ready running
>>>>> - 12:0:0:1 sdf 8:80  active ready running
>>>>> - 9:0:0:1  sdc 8:32  active ready running
>>>> `- 1:0:0:1  sdb 8:16  active ready running
>>> `-+- policy='service-time 0' prio=10 status=enabled
>>>   |- 14:0:0:1 sdh 8:112 active ready running
>>>   |- 15:0:0:1 sdi 8:128 active ready running
>>>   |- 10:0:0:1 sdd 8:48  active ready running
>>>   `- 11:0:0:1 sde 8:64  active ready running
>>> 
>>>  # sg_rtpg /dev/sdh (Active/Optimized)
>>> target port group asymmetric access state : 0x00
>>> 
>>> Signed-off-by: Brian Bunker <brian@xxxxxxxxxxxxxxx>
>>> Signed-off-by: Seamus Connor <sconnor@xxxxxxxxxxxxxxx>
>>> ---
>>>  multipathd/main.c | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>> 
>>> diff --git a/multipathd/main.c b/multipathd/main.c
>>> index 230c9d10..dd48be74 100644
>>> --- a/multipathd/main.c
>>> +++ b/multipathd/main.c
>>> @@ -1937,6 +1937,11 @@ reinstate_path (struct path * pp)
>>>   else {
>>>   condlog(2, "%s: reinstated", pp->dev_t);
>>>   update_queue_mode_add_path(pp->mpp);
>>> + if (strcmp(pp->prio.name, PRIO_SYSFS) == 0) {
>>> + condlog(2, "%s: rescan target to update
>>> priorities",
>>> + pp->dev_t);
>>> + rescan_path(pp->udev);
>> 
>> I can see why this is necessary with the sysfs target, but AFAICT
>> rescanning the scsi device will end up calling scsi_execute_cmd to
>> update the vpd pages, and this can block.  Obviously, we are doing
>> this
>> after the checker has verified that the path is up, but even still,
>> we're trying to cut down on the amount of code that can block
>> multipathd
>> on an inaccessible device, instead of increase it.
>> 
>> Perhaps it would be better to make the prioritizer itself smarter, so
>> that it could know when it needs to run the rescan(), and it could do
>> that in a separate thread.
>> 
>> Thoughts?
> 
> A full rescan shouldn't be necessary. All that's needed is that the
> kernel issue another RTPG. AFAICS that should happen as soon as the
> target responds to any command with a sense key of UNIT ATTENTION with
> ASC=0x2a and ascq=6 or 7 (ALUA state change, ALUA state transition
> failed).
> 
> @Brian, does your storage unit not do this? If so, I suggest we disable
> the sysfs prioritizer for pure storage.
> 
> Otherwise, as far as multipathd is concerned, when a path is
> reinstated, it should be sufficient to send any IO command to trigger
> an RTPG. Or am I missing something here?
> 
> Martin

Martin,

What I am gaining with the rescan is exactly that. You are correct the ALUA device handler the kernel has to send an RTPG to the target. 

We do set a unit attention to have the initiator paths go into the ANO state before we reboot leading to the path loss, but we do not set a unit attention when the paths come back up. We have relied on the initiator’s polling to pick up the ALUA state change which they always have in the past and the ‘alua’ prioritizer still will. For us to add a unit attention would work, but there are couple of issues with that.

1. Unit attentions may not get back to the initiator. It is not guaranteed.
2. Paths could take a very long time to come back. We might not get these paths back for a very long time. Sometimes it is just a reboot. Other times it is a hardware replacement. It is possible for us to keep this state forever and post when when that I_T nexus returns but we haven’t had to.

If we did post the unit attention, everything works as expected. I have verified this, but I would also hope that the polling of the checkers would also unstick my stale ALUA state and we won’t have to.

I put this rescan_path inline to show the problem and the fix. I wasn’t sure the ‘right’ place to put it. I get that it would be better not to block on this. It should be possible to put this in a thread so that it does not. The other caller of rescan_path I guess is also doing the same thing when it is handling the wwid change.

Thanks,
Brian






[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux