Ok, I guess those changes can come incrementally over this patch then.
Applied.

On Mon, Jun 9, 2014 at 10:22 PM, Benjamin Marzinski <bmarzins@xxxxxxxxxx> wrote:
On Thu, May 15, 2014 at 11:45:40PM +0200, Christophe Varoqui wrote:
Sorry I dropped the ball on this one.
> Ben,
> I'd need your ack on this one.
> Best regards,
> Christophe Varoqui
I'm o.k. with this patch. The biggest issue I have with it has nothing
to do with its correctness, but with rlookup_wwid()'s use of scan_device.
Previously, the only scan_device call always failed. Now we scan every
device name, but we don't ever get anything out of it. First off, if we
find a match, we will never use the id. Second, if we don't find a match we
return the id of the alias we were looking for, but if we do find a
match we return the next id after the one we were looking for (which is
completely pointless).
It seems like we could just make rlookup_wwid() return success or failure,
and then call scan_device() from use_existing_alias() if we need to, and
take out a bunch of pointless work that rlookup_wwid() is doing.
-Ben
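
For illustration only, the refactoring Ben describes above might look
roughly like the sketch below. It is not taken from the patch or from
multipath-tools: the in-memory bindings table, the signatures, the helper
format_alias(), and the fallback of picking the first unused id are all
assumptions made to keep the example self-contained. The names
rlookup_wwid(), scan_device() and use_existing_alias() are borrowed from
the discussion, but their real prototypes differ.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the bindings file: alias -> wwid pairs. */
struct binding { const char *alias; const char *wwid; };
static const struct binding bindings[] = {
	{ "mpatha", "wwid-A" },
	{ "mpathb", "wwid-B" },
};
#define N_BINDINGS (sizeof(bindings) / sizeof(bindings[0]))

enum alias_result { ALIAS_KEEP, ALIAS_NEW };

/* Reverse lookup reduced to success (0) or failure (-1); no id math here. */
static int rlookup_wwid(const char *alias, const char **wwid_out)
{
	size_t i;

	for (i = 0; i < N_BINDINGS; i++) {
		if (strcmp(bindings[i].alias, alias) == 0) {
			*wwid_out = bindings[i].wwid;
			return 0;
		}
	}
	return -1;
}

/* Decode "<prefix><letters>" into a 1-based id (a=1, z=26, aa=27). */
static int scan_device(const char *alias, const char *prefix)
{
	size_t plen = strlen(prefix);
	const char *c;
	int id = 0;

	if (strncmp(alias, prefix, plen) != 0 || alias[plen] == '\0')
		return -1;
	for (c = alias + plen; *c; c++) {
		if (*c < 'a' || *c > 'z')
			return -1;
		id = id * 26 + (*c - 'a' + 1);
	}
	return id;
}

/* Inverse of scan_device(): encode a 1-based id as "<prefix><letters>". */
static void format_alias(int id, const char *prefix, char *buf, size_t len)
{
	char tmp[16];
	int pos = (int)sizeof(tmp) - 1;

	tmp[pos] = '\0';
	while (id > 0 && pos > 0) {
		id--;
		tmp[--pos] = (char)('a' + id % 26);
		id /= 26;
	}
	snprintf(buf, len, "%s%s", prefix, tmp + pos);
}

/* Keep the boot-time alias when it is safe, else generate a fresh one. */
static enum alias_result use_existing_alias(const char *wwid,
		const char *alias_old, const char *prefix,
		char *out, size_t len)
{
	const char *bound;
	int id;

	if (rlookup_wwid(alias_old, &bound) == 0) {
		if (strcmp(bound, wwid) == 0) {
			snprintf(out, len, "%s", alias_old);
			return ALIAS_KEEP;
		}
	} else if (scan_device(alias_old, prefix) > 0) {
		/* Unbound and well formed: only now is the id worth parsing. */
		snprintf(out, len, "%s", alias_old);
		return ALIAS_KEEP;
	}
	/* Alias belongs to another device (or is malformed): first free id. */
	for (id = 1; ; id++) {
		format_alias(id, prefix, out, len);
		if (rlookup_wwid(out, &bound) != 0)
			return ALIAS_NEW;
	}
}
```

With the toy table above, use_existing_alias("wwid-C", "mpathb", "mpath",
...) would return ALIAS_NEW and produce "mpathc", because "mpathb" is
already bound to a different wwid. The sketch ignores the case where the
wwid itself already has a binding under another alias.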
>
> On Thu, May 15, 2014 at 9:21 PM, Stewart, Sean
> <Sean.Stewart@xxxxxxxxxx> wrote:
>
> Ping... Any additional comments or suggestions for this patch?
> Bumping in case it got lost in the backlog. :)
> On Fri, 2014-04-11 at 17:01 +0000, Stewart, Sean wrote:
> > On Fri, 2014-04-11 at 17:03 +0100, Bryn M. Reeves wrote:
> > > On Fri, Mar 28, 2014 at 09:01:14PM +0000, Stewart, Sean wrote:
> > > > When a system is booted to the SAN, a condition can occur where one
> > > > user friendly name is given to a disk during boot, but multipathd tries
> > > > to allocate a different one after boot. If the second alias is already
> > > > used by another device, multipathd can't rename it. Multipathd then has
> > > > incorrect information about the alias/wwid relationships, which can
> > > > result in paths being added to the wrong map.
> > >
> > > This should only happen if the initramfs and root file system have
> > > inconsistent multipath configurations (either multipath.conf or bindings
> > > / wwids file mismatched). That's not really a valid configuration for
> > > the system to be in and leads to the type of problems you describe.
> >
> > It is true that it only happens if they are out of sync. We tried
> > remaking the initramfs to fix the problem, but it didn't help.
> > >
> > > > This patch works around this problem by first trying to use the alias
> > > > already bound to a device during boot. If the bindings file has that
> > > > alias bound to a different device, it'll auto generate a new alias to
> > > > rename it to.
> > >
> > > To be honest I'd prefer to see this cause an error. These types of
> > > configurations currently run the risk of silent data corruption - I'd
> > > much rather deal with a system that refuses to boot due to an out of
> > > date initramfs image than one that quietly remaps paths in unexpected
> > > ways.
> >
> > The issue, though, is that the system does not refuse to boot. In the
> > case we saw, it booted anyway, our QA engineer ran a test, and it ended
> > with a data corruption. A user could perform a fresh installation, map
> > new luns, reboot, and without any way of realizing it have essentially a
> > ticking time bomb on their hands, ready to go off as soon as there's a
> > blip in the SAN.
>
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel