On Mon, 2021-01-04 at 18:33 -0600, Benjamin Marzinski wrote:
> On Fri, Dec 18, 2020 at 11:56:47PM +0000, Martin Wilck wrote:
> > On Fri, 2020-12-18 at 17:06 -0600, Benjamin Marzinski wrote:
> > > I was asked to explain how checker_timeout works for checkers like
> > > directio, that don't issue scsi commands with an explicit timeout
> > >
> > > Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx>
> > > ---
> > >  multipath/multipath.conf.5 | 9 +++++++--
> > >  1 file changed, 7 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
> > > index ea66a01e..c7c4184b 100644
> > > --- a/multipath/multipath.conf.5
> > > +++ b/multipath/multipath.conf.5
> > > @@ -639,8 +639,13 @@ The default is: \fBno\fR
> > >  .
> > >  .TP
> > >  .B checker_timeout
> > > -Specify the timeout to use for path checkers and prioritizers that issue SCSI
> > > -commands with an explicit timeout, in seconds.
> > > +Specify the timeout to use for path checkers and prioritizers, in seconds.
> > > +Only prioritizers that issue scsi commands use checker_timeout. Checkers
> > > +that support an asynchronous mode (\fItur\fR and \fIdirectio\fR), will
> > > +return shortly after being called by multipathd, regardless of whether the
> > > +storage array responds. If the storage array hasn't responded, mulitpathd will
> >
> > typo
> >
> > > +check for a response every second, until \fIchecker_timeout\fR seconds have
> > > +elapsed.
> >
> > This is a bit confusing IMHO. Most importantly, checkers will consider
> > a path down if it doesn't respond to the checker command after the
> > given timeout. The async behavior doesn't fit too well into this
> > section. Maybe we should mention and explain the async property in the
> > path_checkers section, and only refer to that here.
>
> Sounds reasonable.
>
> > (Btw is the directio checker still deprecated after all the work you
> > recently put into it? The man page still says so).
>
> No. I'll change that. There are times when devices claim to be ready
> with the tur checker, when in truth, IO to them will fail. In these
> cases, the directio checker is the best way to avoid having paths
> bouncing up and down.

Right. I recently had one such case with persistent reservations.
SPC-4 mandates that the status of TUR commands is independent of PR
status (TUR is always "allowed"), while obviously ordinary IO would
fail if the active PR excludes the current host. That basically makes
the TUR checker inappropriate as soon as PRs on SPC-4 compliant devices
are in use.

As we have support for PR already, I wonder if we could/should extend
the TUR checker to take this into account.

Cheers,
Martin

-- 
Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
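
For readers following the thread, the async polling described in the
quoted patch text (the checker returns right away, and multipathd looks
for a response once per second until checker_timeout seconds have
elapsed) can be sketched roughly as below. This is only an illustrative
simulation, not multipath-tools code: the function name, the state
strings and the has_response callback are all made up for the example,
time is simulated in whole-second ticks rather than slept, and it
assumes that any response means the path is up (a real checker can also
get a failing response).

```python
from typing import Callable, Tuple

# Hypothetical state names for this sketch only.
PATH_UP = "up"
PATH_DOWN = "down"


def poll_async_checker(has_response: Callable[[int], bool],
                       checker_timeout: int) -> Tuple[str, int]:
    """Simulate once-per-second polling of an asynchronous path checker.

    has_response(t) is a hypothetical callback that reports whether the
    storage array has answered by t seconds after the check was issued.
    Returns the resulting path state and the number of seconds waited.
    """
    # Tick 0 is the initial call; ticks 1..checker_timeout are the
    # once-per-second follow-up polls.
    for elapsed in range(checker_timeout + 1):
        if has_response(elapsed):
            # Simplifying assumption: a response means the path is usable.
            return PATH_UP, elapsed
    # No response within checker_timeout seconds: treat the path as down.
    return PATH_DOWN, checker_timeout


if __name__ == "__main__":
    # Array that answers after 3 seconds, with a 30 second timeout.
    print(poll_async_checker(lambda t: t >= 3, 30))
    # Array that never answers, with a 5 second timeout.
    print(poll_async_checker(lambda t: False, 5))
```

The second call shows the point made in the review: a path that never
answers the checker command ends up considered down once the timeout
has passed.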