Re: RFC for multipath queue_if_no_path timeout.

On Thu, Oct 17, 2013 at 12:03:10PM -0700, Frank Mayhar wrote:
> Dragging this back up into the light...
> 
> On Thu, 2013-09-26 at 19:49 -0400, Mike Snitzer wrote:
> > Frank, I had a look at your patch.  It leaves a lot to be desired, I was
> > starting to clean it up but ultimately found myself agreeing with
> > Alasdair's original point: that this policy should be implemented in the
> > userspace daemon.
> 
> I've found and fixed a couple of bugs but I would still like to know
> what issues you had with the patch.  As I said before, I would be more
> than happy to clean it up.
> 
> In the time since we had this discussion, by the way, we ran into a
> problem that a userspace daemon can't solve:  That of shutdown.  We ran
> into a number of failures in which systems were hung for hours.  It
> turned out that they were caused by a regular system shutdown.  Our
> backing store is network-based and networking was getting killed before
> applications (as is usually the case), leaving I/O outstanding on the
> device.  Since queue_if_no_path was set, the I/O wasn't dumped and our
> daemon was killed by shutdown very shortly thereafter so it couldn't
> recover (otherwise it would have cleaned things up).
> 

Was multipathd force killed? What was "queue_without_daemon" in the
defaults section of multipath.conf set to?

If "queue_without_daemon" is set to "no", multipathd should disable
queueing when it is stopped. This was added specifically to avoid this
issue.
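
For reference, a minimal sketch of that setting in /etc/multipath.conf
(the path is an assumption; adjust for your distribution):

    defaults {
        # Turn off queueing on all multipath devices once
        # multipathd stops, so I/O fails instead of hanging.
        queue_without_daemon no
    }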

-Ben

> With those I/Os sitting queued in multipath, with no network and no
> daemon to turn off queue_if_no_path, the systems just sat.  When we
> finally diagnosed this, we realized that the timeout would work
> perfectly to solve the problem, automatically turning queue_if_no_path
> off shortly after the network went away without depending on the
> intervention of the no-longer-running daemon.
> 
> So how do you guys deal with this failure scenario?
> -- 
> Frank Mayhar
> 310-460-4042
> 
> --
> dm-devel mailing list
> dm-devel@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/dm-devel
