Re: RFC: one more time: SCSI device identification

Martin Wilck <martin.wilck@xxxxxxxx> · Tue, 27 Apr 2021 20:33:43 +0000

On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> On Mon, 2021-04-26 at 13:16 +0000, Martin Wilck wrote:
> > On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > > 
> > > > While we're at it, I'd like to mention another issue: WWID changes.
> > > > 
> > > > This is a big problem for multipathd. The gist is that the device
> > > > identification attributes in sysfs only change after rescanning the
> > > > device. Thus if a user changes LUN assignments on a storage system,
> > > > it can happen that a direct INQUIRY returns a different WWID than
> > > > sysfs does, which is fatal. If we plan to rely more on sysfs for
> > > > device identification in the future, the problem gets worse.
> > > 
> > > I think many devices rely on the fact that they are identified by
> > > Vendor/model/serial_nr, because in most professional SAN storage
> > > systems you can pre-set the serial number to a custom value; so if
> > > you want a new disk (maybe a snapshot) to be compatible with the old
> > > one, just assign the same serial number. I guess that's the idea
> > > behind it.
> > 
> > What you are saying sounds dangerous to me. If a snapshot has the same
> > WWID as the device it's a snapshot of, it must not be exposed to any
> > host(s) at the same time as its origin, otherwise the host may
> > happily combine it with the origin into one multipath map, and data
> > corruption will almost certainly result.
> > 
> > My argument is about how the host is supposed to deal with a WWID
> > change if it happens. Here, "WWID change" means that a given H:C:T:L
> > suddenly exposes different device designators than it used to, while
> > this device is in use by a host. Here, too, data corruption is
> > imminent, and can happen in the blink of an eye. To avoid this,
> > several things are needed:
> > 
> >  1) the host needs to get notified about the change (likely by a UA
> >     of some sort)
> >  2) the kernel needs to react to the notification immediately, e.g.
> >     by blocking IO to the device,
> 
> There's no way to do that, in principle.  Because there could be
> other I/Os in flight.  You might (somehow) avoid retrying an I/O
> that got a UA until you figured out if something changed, but other
> I/Os can already have been sent to the target, or issued before you
> get to look at the status.

Right. But in practice, a WWID change will hardly happen under full IO
load. The storage side will probably have to block IO while this
happens, at least for a short time period. So blocking and quiescing
the queue upon a UA might still work most of the time. Even if we
were too late already, the sooner we stop the queue, the better.
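
To make the idea a bit more concrete, here is a rough sketch in Python.
It is a toy model, not kernel code: the Device class, the QueueState
names and the read_wwid callback are all made up for illustration. It
only shows the intended ordering (block first, then re-read the WWID,
then either resume or fail the device). It deliberately does not model
IO that is already in flight, which is exactly the gap Ewan points out.

```python
from enum import Enum


class QueueState(Enum):
    RUNNING = "running"
    BLOCKED = "blocked"
    FAILED = "failed"


class Device:
    """Toy model of one device queue. All names here are hypothetical."""

    def __init__(self, wwid):
        self.known_wwid = wwid      # WWID recorded when the device was set up
        self.state = QueueState.RUNNING
        self.submitted = []         # IOs that were allowed through

    def submit(self, io):
        """Accept an IO only while the queue is running."""
        if self.state is not QueueState.RUNNING:
            return False
        self.submitted.append(io)
        return True

    def on_unit_attention(self, read_wwid):
        """Block the queue first, then decide whether it may resume.

        read_wwid is a callable standing in for a fresh identification
        query to the device (an assumption of this sketch).
        """
        self.state = QueueState.BLOCKED
        if read_wwid() == self.known_wwid:
            self.state = QueueState.RUNNING
        else:
            # Identity changed: keep the queue stopped for good.
            self.state = QueueState.FAILED
        return self.state
```

After a UA that reveals a different WWID, submit() refuses all further
IO, so the damage is limited to whatever was sent before the UA was
processed.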

The current algorithm in multipath-tools needs to detect a path going
down and being reinstated. The time interval during which a WWID change
will go unnoticed is one or more path checker intervals, typically on
the order of 5-30 seconds. If we could decrease this interval to the
sub-second or even millisecond range by blocking the queue in the
kernel quickly, we'd have made a big step forward.
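
As a back-of-the-envelope illustration of that window, the following
Python snippet models a checker that runs at fixed ticks (0, interval,
2*interval, ...). The function name and the checks_needed parameter are
assumptions of this sketch, not anything taken from multipath-tools;
checks_needed just stands for "one or more" checker runs being required
before the change is noticed.

```python
import math


def detection_delay(change_time, interval, checks_needed=1):
    """Seconds between a WWID change and the checker run that notices it.

    Assumes the checker fires at t = 0, interval, 2*interval, ... and
    that the change is noticed on the checks_needed-th run at or after
    change_time. Both assumptions are simplifications.
    """
    if change_time < 0 or interval <= 0 or checks_needed < 1:
        raise ValueError("invalid arguments")
    # Index of the first checker tick at or after the change.
    k = math.ceil(change_time / interval)
    first_tick = k * interval
    return (first_tick - change_time) + (checks_needed - 1) * interval
```

With a 5 s interval, a change 0.5 s after a checker run goes unnoticed
for 4.5 s in this model; with a 1 ms reaction time the window can never
exceed 1 ms.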

Regards
Martin
