Re: [PATCH] multipathd: the sysfs prioritizer can return stale data

On Tue, 2024-01-30 at 11:41 -0500, Benjamin Marzinski wrote:
> On Tue, Jan 23, 2024 at 06:00:07PM -0800, Brian Bunker wrote:
> > When a path is lost and later reinstated, the ALUA device
> > handler does not pick up the change and may continue to
> > provide incorrect (stale) information about the path's ALUA
> > state to the 'sysfs' prioritizer if the path's priorities
> > were changed prior to the loss of those paths.
> > 
> > On the loss of an I_T nexus, the path state should not
> > continue to use its last known state. A lot could have changed,
> > and a lot of time passed, between the path loss and its
> > eventual reinstatement.
> > 
> > The 'alua' prioritizer didn't have this issue since it
> > always got the ALUA state from the target by sending an RTPG
> > request on the path and updated the priority. However with
> > the detect priority set to true by default, the 'sysfs'
> > prioritizer will be used on targets that support ALUA rather
> > than the 'alua' prioritizer.
> > 
> > Without re-evaluating the ALUA state when a path returns,
> > multipath is left with path states which do not reflect
> > the actual ALUA states the target is providing.
> > 
> > 3624a9370f0f545fc7c3e46a100011010 dm-2 PURE,FlashArray
> > size=3.0T features='0' hwhandler='1 alua' wp=rw
> > |-+- policy='service-time 0' prio=50 status=active
> > | |- 13:0:0:1 sdg 8:96  active ready running
> > | |- 12:0:0:1 sdf 8:80  active ready running
> > | |- 9:0:0:1  sdc 8:32  active ready running
> > | `- 1:0:0:1  sdb 8:16  active ready running
> > `-+- policy='service-time 0' prio=10 status=enabled
> >   |- 14:0:0:1 sdh 8:112 active ready running
> >   |- 15:0:0:1 sdi 8:128 active ready running
> >   |- 10:0:0:1 sdd 8:48  active ready running
> >   `- 11:0:0:1 sde 8:64  active ready running
> > 
> >  # sg_rtpg /dev/sdh (Active/Optimized)
> > target port group asymmetric access state : 0x00
> > 
> > Signed-off-by: Brian Bunker <brian@xxxxxxxxxxxxxxx>
> > Signed-off-by: Seamus Connor <sconnor@xxxxxxxxxxxxxxx>
> > ---
> >  multipathd/main.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index 230c9d10..dd48be74 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -1937,6 +1937,11 @@ reinstate_path (struct path * pp)
> >  	else {
> >  		condlog(2, "%s: reinstated", pp->dev_t);
> >  		update_queue_mode_add_path(pp->mpp);
> > +		if (strcmp(pp->prio.name, PRIO_SYSFS) == 0) {
> > +			condlog(2, "%s: rescan target to update priorities",
> > +				pp->dev_t);
> > +			rescan_path(pp->udev);
> 
> I can see why this is necessary with the sysfs target, but AFAICT
> rescanning the scsi device will end up calling scsi_execute_cmd to
> update the vpd pages, and this can block.  Obviously, we are doing this
> after the checker has verified that the path is up, but even still,
> we're trying to cut down on the amount of code that can block multipathd
> on an inaccessible device, instead of increasing it.
> 
> Perhaps it would be better to make the prioritizer itself smarter, so
> that it could know when it needs to run the rescan(), and it could do
> that in a separate thread.
> 
> Thoughts?

A full rescan shouldn't be necessary. All that's needed is that the
kernel issue another RTPG. AFAICS that should happen as soon as the
target responds to any command with a sense key of UNIT ATTENTION with
ASC=0x2a and ASCQ=0x06 or 0x07 (ALUA state change, ALUA state transition
failed).
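
To make that condition concrete, here is a minimal sketch of the sense
data check. It is not the kernel's code; the helper name
sense_signals_alua_change() and the macro names are made up for
illustration, and it only looks at the sense key, ASC and ASCQ in
fixed-format (0x70/0x71) and descriptor-format (0x72/0x73) sense data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative names, not taken from the kernel or multipath-tools. */
#define SK_UNIT_ATTENTION           0x06
#define ASC_PARAMS_CHANGED          0x2a
#define ASCQ_ALUA_STATE_CHANGED     0x06
#define ASCQ_ALUA_TRANSITION_FAILED 0x07

/*
 * Hypothetical helper: return true if the sense buffer reports
 * UNIT ATTENTION with ASC 0x2a and ASCQ 0x06 or 0x07, i.e. the
 * condition after which a fresh RTPG would be expected.
 */
static bool sense_signals_alua_change(const uint8_t *sb, size_t len)
{
	uint8_t key, asc, ascq;

	if (sb == NULL || len < 1)
		return false;

	switch (sb[0] & 0x7f) {		/* response code, VALID bit masked */
	case 0x70:
	case 0x71:			/* fixed format */
		if (len < 14)
			return false;
		key = sb[2] & 0x0f;
		asc = sb[12];
		ascq = sb[13];
		break;
	case 0x72:
	case 0x73:			/* descriptor format */
		if (len < 4)
			return false;
		key = sb[1] & 0x0f;
		asc = sb[2];
		ascq = sb[3];
		break;
	default:
		return false;
	}

	return key == SK_UNIT_ATTENTION && asc == ASC_PARAMS_CHANGED &&
	       (ascq == ASCQ_ALUA_STATE_CHANGED ||
		ascq == ASCQ_ALUA_TRANSITION_FAILED);
}
```

Any other UNIT ATTENTION (for example ASC=0x2a with a different ASCQ)
makes this helper return false.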

@Brian, does your storage unit not do this? If it doesn't, I suggest we
disable the sysfs prioritizer for Pure Storage.

Otherwise, as far as multipathd is concerned, when a path is
reinstated, it should be sufficient to send any IO command to trigger
an RTPG. Or am I missing something here?
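
As a rough illustration of "send any command", the sketch below issues
a TEST UNIT READY through the SG_IO ioctl. This is an assumption-laden
example, not what multipathd does: tur_prepare(), tur_send() and the
5 second timeout are invented here, and whether the target answers with
the UNIT ATTENTION that leads to an RTPG depends on the storage.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define TUR_CDB_LEN    6
#define TUR_TIMEOUT_MS 5000	/* assumed value, pick what fits */

/* Hypothetical helper: fill an SG_IO header for TEST UNIT READY
 * (opcode 0x00, all-zero 6-byte CDB, no data transfer). */
static void tur_prepare(sg_io_hdr_t *hdr, uint8_t *cdb,
			uint8_t *sense, unsigned char sense_len)
{
	memset(cdb, 0, TUR_CDB_LEN);
	memset(hdr, 0, sizeof(*hdr));
	hdr->interface_id = 'S';
	hdr->dxfer_direction = SG_DXFER_NONE;
	hdr->cmd_len = TUR_CDB_LEN;
	hdr->cmdp = cdb;
	hdr->mx_sb_len = sense_len;
	hdr->sbp = sense;
	hdr->timeout = TUR_TIMEOUT_MS;
}

/* Hypothetical helper: returns -1 if the ioctl itself fails, 0 if the
 * command completed cleanly, 1 if the device reported a non-zero
 * status (e.g. CHECK CONDITION carrying a UNIT ATTENTION). */
static int tur_send(int fd)
{
	uint8_t cdb[TUR_CDB_LEN];
	uint8_t sense[32];
	sg_io_hdr_t hdr;

	tur_prepare(&hdr, cdb, sense, sizeof(sense));
	if (ioctl(fd, SG_IO, &hdr) < 0)
		return -1;
	if (hdr.status || hdr.host_status || hdr.driver_status)
		return 1;
	return 0;
}
```

Note that the ioctl is synchronous and can wait up to the timeout, so
it would have the same blocking concern Ben raised if it were called
from multipathd's main loop.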

Martin






