On Tue, 2017-08-29 at 11:18 +0800, Guan Junxiong wrote:
> > > Three parameters are added for the admin:
> > > "path_io_err_sample_time", "path_io_err_num_threshold" and
> > > "path_io_err_recovery_time". If path_io_err_sample_time and
> > > path_io_err_recovery_time are set to a value greater than 0, when
> > > a path fail event occurs due to an IO error, multipathd will
> > > enqueue this path into a queue, each member of which is sent
> > > direct-reading asynchronous IO at a fixed sample rate of 100Hz.
> >
> > 1. I don't think it's prudent to start the flakiness check for
> > every PATH_DOWN event. Rather, the test should only be started for
> > paths that have failed repeatedly in a certain time frame, so that
> > we have reason to assume they're flaky.
>
> Sounds reasonable. And IMO, at least two new options should be
> introduced in multipath.conf, time frame and threshold, if we let
> the admin tune this. OTOH, if we don't want to bother the admin, we
> can handle the pre-flakiness check automatically. Maybe a time frame
> of 60 seconds and a threshold of 2 PATH_UP->PATH_DOWN events is
> reasonable. Which solution do you prefer?

I wonder if the existing san_path_err_threshold etc. could be reused
for that. As I said before, I can't imagine that they could be used
reasonably in parallel with your approach, anyway. I can't help you
with the default values; you have done the testing.

> > 2. How did you come up with the 100Hz sample rate value? It sounds
> > quite high in the first place (that was my first thought when I
> > read it), and OTOH those processes that run into intermittent IO
> > errors are probably the ones that do I/O almost continuously, so
> > maybe reading continuously for a limited time is the better idea?
>
> I am worried that reading continuously for a limited time might
> affect the usable bandwidth of the real upper-layer transactions.
> If not, reading continuously is the closest to the real case.
> How about a 10Hz sample rate and 10ms of continuous reading? In
> other words, in every 100ms interval, we read continuously for 10ms.

> > 3. I would suggest setting the dm status of paths undergoing the
> > flakiness test to "failed" immediately. That way the IO caused by
> > the test will not interfere with the regular IO on the path.
>
> Great.

> > 4. I can see that you chose aio to avoid having to create a
> > separate thread for each path being checked. But I'm wondering
> > whether using aio for this is a good choice in general. My concern
> > with aio is that to my knowledge there's no good low-level
> > cancellation mechanism. When you io_cancel() one of your requests,
> > you can be sure not to get notified of its completion any more,
> > but that doesn't mean that it wouldn't continue to lurk down in
> > the block layer. But I may be overlooking essential stuff here.

> > > The IO accounting process for a path will last for
> > > path_io_err_sample_time. If the number of IO errors on a
> > > particular path is greater than path_io_err_num_threshold, then
> > > the path will not reinstate for
> >
> > It would be more user-friendly to allow the user to specify the
> > error rate threshold as a percentage.
>
> An error rate threshold as a percentage is not enough to distinguish
> small error rates. How about a permillage rate (1/1000)?

> > > +struct io_err_stat_path *find_err_path_by_dev(vector pathvec,
> > > char *dev)
> >
> > You are re-implementing generic functionality here
> > (find_path_by_dev), why?
>
> Yes, because the structure is different: struct io_err_stat_path
> and struct path.

Why don't you just add a struct io_err_stat_path * in "struct path"?

Regards,
Martin

--
Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel.
+49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel