On 2017/11/6 20:15, Muneendra Kumar M wrote:
> Hi Guan,
> Any update on this patch?
>
> Regards,
> Muneendra.
>

It's not merged yet. It is waiting for Christophe to merge it. I hope Christophe can give some feedback soon.
BTW, your clients (and my clients) can keep using this patch until it is actually merged into mainline.
Please be patient. I think Christophe will eventually pick up this patch.

Best wishes.
Guan

> -----Original Message-----
> From: Guan Junxiong [mailto:guanjunxiong@xxxxxxxxxx]
> Sent: Thursday, November 02, 2017 6:20 AM
> To: christophe.varoqui@xxxxxxxxxxx
> Cc: dm-devel@xxxxxxxxxx; Muneendra Kumar M <mmandala@xxxxxxxxxxx>; mwilck@xxxxxxxx; shenhong09@xxxxxxxxxx; niuhaoxin@xxxxxxxxxx
> Subject: Re: [PATCH v7 0/2] multipath-tools: intermittent IO error accounting to improve reliability
>
> Dear Christophe,
>
> Could you please consider applying this patch or give any feedback about it?
> We (Huawei and Brocade) are looking forward to your reply.
> Thanks.
>
> Regards
> Guan Junxiong
>
> .
>
>
> On 2017/10/24 9:57, Guan Junxiong wrote:
>> Hi Christophe and All,
>>
>> This patch set adds a new method of path state checking based on
>> IO error accounting. This is useful in many scenarios such as
>> intermittent IO errors on a path due to intermittent frame drops,
>> intermittent corruptions, network congestion or a shaky link.
>>
>> This patch set is of significance because of this (quoted from the
>> discussion with Muneendra, Brocade):
>>
>> There are typically two types of SAN network problems that are
>> categorized as marginal issues. These issues by nature are not
>> permanent in time and do come and go away over time.
>> 1) Switches in the SAN can have intermittent frame drops or intermittent
>> frame corruptions due to bad optics cable (SFP) or any such wear/tear port
>> issues. This causes ITL flows that go through the faulty switch/port to
>> intermittently experience frame drops.
>> 2) There exist SAN topologies where switch ports in the fabric
>> become the only conduit for many different ITL (host--target--LUN)
>> flows across multiple hosts. These single network paths are essentially
>> shared across multiple ITL flows. Under these conditions, if the port link
>> bandwidth is not able to handle the net sum of the shared ITL flows' bandwidth
>> going through the single path, then we could see intermittent network
>> congestion problems. This condition is called network oversubscription.
>> The intermittent congestion can delay SCSI exchange completion time
>> (an increase in I/O latency is observed).
>>
>> To overcome the above network issues and many more such target issues,
>> there are frame level retries that are done in HBA device firmware and
>> I/O retries in the SCSI layer. These retries might succeed for one of
>> the following reasons:
>> 1) The intermittent switch/port issue is not observed
>> 2) The retry I/O is a new SCSI exchange. This SCSI exchange can take an
>> alternate SAN path for the ITL flow, if such a SAN path exists.
>> 3) Network congestion disappears momentarily because the net I/O bandwidth
>> coming from multiple ITL flows on the single shared network path is
>> something the path can handle
>>
>> However, in some cases we have seen that I/O retries don't succeed because
>> the retry I/Os hit a SAN network path that has an intermittent
>> switch/port issue and/or network congestion.
>>
>> On the host we thus see configurations with two or more ITL paths sharing
>> the same target/LUN going through two or more HBA ports. These HBA
>> ports are connected through two or more SANs to the same target/LUN.
>> If the I/O fails at the multipath layer, then the ITL path is turned
>> into Failed state. Because of the marginal nature of the network, the
>> next Health Check command sent from the multipath layer might succeed,
>> which results in putting the ITL path back into Active state.
>> You end up seeing the DM path state going through Active, Failed, Active
>> transitions. This results in an overall reduction in application I/O
>> throughput and sometimes application I/O failures (because of timing
>> constraints). All this can happen because of I/O retries and I/O
>> requests moving across multiple paths of the DM device. In the host it
>> is to be noted that all I/O retries on a single path and I/O movement
>> across multiple paths result in slowing down the forward progress of
>> new application I/O. The reason is that the above I/O re-queue actions
>> are given higher priority than the newer I/O requests coming from the
>> application.
>>
>> The above condition of the ITL path is hence called "marginal".
>>
>> What we desire is for the DM to deterministically categorize an ITL
>> Path as “marginal” and move all the pending I/Os from the marginal
>> Path to an Active Path. This will help in meeting application I/O
>> timing constraints. We also want a capability to automatically reinstate
>> the marginal path into Active once the marginal condition in the network
>> is fixed.
>>
>>
>> Here is the description of the implementation:
>> 1) PATCH 1/2 implements the algorithm that sends a couple of
>> continuous IOs to a path which suffers two failed events in less than
>> a given time. Those IOs are sent at a fixed rate of 10 Hz.
>> 2) PATCH 2/2 discards the original algorithm because of this:
>> the detection sample interval of that path checker is so big/coarse that
>> it doesn't see what happens in the middle of the sample interval. We
>> have PATCH 1/2 as a better method.
>>
>>
>> Changes from V6:
>> * fix the warning of unwrapped commit description in patch 1/2
>> * add Reviewed-by tag of Muneendra
>> * add detailed scenario description in the cover letter
>>
>> Changes from V5:
>> * rebase on the latest release 0.7.3
>>
>>
>> Changes from V4:
>> * path_io_err_XXX -> marginal_path_err_XXX.
>>   (Muneendra)
>> * add one more parameter named marginal_path_double_failed_time instead
>>   of the fixed 60 seconds for the pre-checking of a shaky path.
>>   (Martin)
>> * fix for "reschedule checking after %d seconds" log
>> * path_io_err_recovery_time -> marginal_path_err_recheck_gap_time.
>> * put the marginal path into PATH_SHAKY instead of PATH_DELAYED
>> * Modify the commit comments to sync with the changes above.
>>
>>
>> Changes from V3:
>> * add a patch to discard the san_path_err_XXX feature
>> * fail the path in the kernel before enqueueing the path for checking
>>   rather than after knowing the checking result to make it more
>>   reliable. (Martin)
>> * use posix_memalign instead of manual alignment for direct IO buffer.
>>   (Martin)
>> * use PATH_MAX to avoid certain compiler warning when opening file
>>   rather than FILE_NAME_SIZE. (Martin)
>> * discard unnecessary sanity check when getting block size (Martin)
>> * do not return 0 in send_each_aync_io if io_starttime of a path is
>>   not set (Martin)
>> * Wait 10ms instead of 60 seconds if every path is down. (Martin)
>> * rename handle_async_io_timeout to poll_async_io_timeout and use polling
>>   method because io_getevents does not return 0 if there are both
>>   timed-out IO and normal IO.
>> * rename hit_io_err_recover_time to hit_io_err_recheck_time
>> * modify the multipath.conf.5 and commit comments to keep in sync with the
>>   above changes
>>
>>
>> Changes from V2:
>> * fix unconditional reschedule forever
>> * use scripts/checkpatch.pl in Linux to clean up informal coding style
>> * fix "continous" and "internel" typos
>>
>>
>> Changes from V1:
>> * send continuous IO instead of a single IO in a sample interval
>>   (Martin)
>> * when recover time expires, we reschedule the checking process
>>   (Hannes)
>> * Use the error rate threshold as a permillage instead of IO
>>   number (Martin)
>> * Use a common io_context for libaio for all paths (Martin)
>> * Other small fixes (Martin)
>>
>>
>> Junxiong Guan (2):
>>   multipath-tools: intermittent IO error accounting to improve
>>     reliability
>>   multipath-tools: discard san_path_err_XXX feature
>>
>>  libmultipath/Makefile      |   5 +-
>>  libmultipath/config.c      |   3 -
>>  libmultipath/config.h      |  21 +-
>>  libmultipath/configure.c   |   7 +-
>>  libmultipath/dict.c        |  88 +++---
>>  libmultipath/io_err_stat.c | 744 +++++++++++++++++++++++++++++++++++++++++++++
>>  libmultipath/io_err_stat.h |  15 +
>>  libmultipath/propsel.c     |  70 +++--
>>  libmultipath/propsel.h     |   7 +-
>>  libmultipath/structs.h     |  15 +-
>>  libmultipath/uevent.c      |  32 ++
>>  libmultipath/uevent.h      |   2 +
>>  multipath/multipath.conf.5 |  89 ++++--
>>  multipathd/main.c          | 140 ++++-----
>>  14 files changed, 1043 insertions(+), 195 deletions(-)
>>  create mode 100644 libmultipath/io_err_stat.c
>>  create mode 100644 libmultipath/io_err_stat.h
>>
>

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel