On Tue, 2017-08-29 at 11:18 +0800, Guan Junxiong wrote:
> > > Three parameters are added for the admin:
> > > "path_io_err_sample_time", "path_io_err_num_threshold" and
> > > "path_io_err_recovery_time". If path_io_err_sample_time and
> > > path_io_err_recovery_time are set to a value greater than 0, when
> > > a path fail event occurs due to an IO error, multipathd will
> > > enqueue this path into a queue, each member of which is sent
> > > direct-reading asynchronous IO at a fixed sample rate of 100Hz.
> >
> > 1. I don't think it's prudent to start the flakiness check for
> > every PATH_DOWN event. Rather, the test should only be started for
> > paths that have failed repeatedly in a certain time frame, so that
> > we have reason to assume they're flaky.
>
> Sounds reasonable. And IMO, at least two new options should be
> introduced in multipath.conf, time frame and threshold, if we let
> the admin tune this. OTOH, if we don't want to bother the admin, we
> can handle the pre-flakiness check automatically. Maybe a time frame
> of 60 seconds and a threshold of 2 PATH_UP->PATH_DOWN events is
> reasonable. Which solution do you prefer?

I wonder if the existing san_path_err_threshold etc. could be reused
for that. As I said before, I can't imagine that they could be used
reasonably in parallel with your approach, anyway. I can't help you
with the default values; you have done the testing.

> > 2. How did you come up with the 100Hz sample rate value? It sounds
> > quite high in the first place (that was my first thought when I
> > read it), and OTOH those processes that run into intermittent IO
> > errors are probably the ones that do I/O almost continuously, so
> > maybe reading continuously for a limited time is the better idea?
>
> I am worried that reading continuously for a limited time might
> affect the usable bandwidth of the real upper-layer transactions.
> If not, reading continuously is the closest to the real case.
> How about a 10Hz sample rate and 10ms of continuous reading? In
> other words, in every 100ms interval, we read continuously for 10ms.

> > 3. I would suggest setting the dm status of paths undergoing the
> > flakiness test to "failed" immediately. That way the IO caused by
> > the test will not interfere with the regular IO on the path.
>
> Great.

> > 4. I can see that you chose aio to avoid having to create a
> > separate thread for each path being checked. But I'm wondering
> > whether using aio for this is a good choice in general. My concern
> > with aio is that to my knowledge there's no good low-level
> > cancellation mechanism. When you io_cancel() one of your requests,
> > you can be sure not to get notified of its completion any more,
> > but that doesn't mean that it wouldn't continue to lurk down in
> > the block layer. But I may be overlooking essential stuff here.

> > > The IO accounting process for a path will last for
> > > path_io_err_sample_time. If the number of IO errors on a
> > > particular path is greater than path_io_err_num_threshold, then
> > > the path will not reinstate for
> >
> > It would be more user-friendly to allow the user to specify the
> > error rate threshold as a percentage.
>
> An error rate threshold as a percentage is not enough to distinguish
> small error rates. How about a permillage rate (1/1000)?

> > > +struct io_err_stat_path *find_err_path_by_dev(vector pathvec,
> > > char *dev)
> >
> > You are re-implementing generic functionality here
> > (find_path_by_dev), why?
>
> Yes, because the structure is different: struct io_err_stat_path
> and struct path.

Why don't you just add a struct io_err_stat_path * in "struct path"?

Regards,
Martin

--
Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel.
+49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel