I have read the description of the failfast feature. Judging from the
symptoms, this may not be a failfast problem. With no IO pressure, once
the disk export is stopped on the storage node, the disk is automatically
removed from the md array. Under continuous IO pressure, however, the disk
is not removed automatically; it is removed immediately after the IO
pressure stops.

Xiao Ni <xni@xxxxxxxxxx> wrote on Wed, Jan 30, 2019 at 5:15 PM:
>
>
>
> On 01/30/2019 03:25 PM, Jack Wang wrote:
> > 李春 <pickup112@xxxxxxxxx> wrote on Wed, Jan 30, 2019 at 7:08 AM:
> >> # Description of problem:
> >> We loaded a disk from two network of storage node via iscsi, merged
> >> into a disk through multipath, and made a raid1 with local disk by
> >> mdadm.
> >> However, when the storage machine of iscsi disk rebooted, raid1 disk
> >> does not automatically eject the abnormal disk when there are some IO
> >> pressure.
> >>
> >> # Version-Release number of selected component (if applicable):
> >> vermagic: 2.6.32-573.el6.x86_64 SMP mod_unload modversions
> >> srcversion: 39AAB97325332236F2FFCA9
> >>
> >> # How reproducible:
> >> always
> >>
> >> # Steps to Reproduce:
> >> 1. export a disk from storage node
> >> 2. load the disk on another node and merge it with multipath
> >> 3. assemble a local disk and the multipath by mdadm to a raid1 disk
> >> 4. reboot
> >>
> >> # Actual results:
> >> * multipath disk not eject from raid1 disk under Fio pressure
> >> * multipath disk eject immediately from raid1 disk when stop Fio pressure
> >>
> >> # Expected results:
> >> * multipath disk eject immediately from raid1 disk under Fio pressure
> >>
> >> # Additional info:
> >> We have done the following tests:
> >> * In rhel6.7 with kernel of 2.6.32-573.el6.x86_64 test, mdadm's raid1
> >> will eliminate the abnormal disk after 5 seconds without IO pressure
> >> * In rhel6.7 with kernel of 2.6.32-573.el6.x86_64 test, in the case of
> >> IO pressure, mdadm's raid1 will not reject the abnormal disk, until
> >> the IO pressure stops, the disk will be removed.
> >> * In rhel7.4 with kernel of 3.10.0-693.el7.x86_64 test, mdadm's raid1
> >> will eliminate the abnormal disk after 5 seconds without IO pressure
> >> * In rhel7.4 with kernel of 3.10.0-693.el7.x86_64 test, mdadm's raid1
> >> will eliminate abnormal disk after 5 seconds under IO pressure
> >>
> >> Thanks for your help.
> > Sounds like, you want failfast feature in upstream, not sure if RH
> > backport it into their kernel.
> Thanks for the reporting and analysis.
> rhel6 is in the period that it's recommended to fix bugs only. So it
> doesn't backport some features.
> I'll have a try to backport this to rhel6.
>
> Regards
> Xiao

-- 
李春 Pickup Li
Chief Architect, Product R&D Department
www.woqutech.com
Hangzhou WOQU Technology Co., Ltd.
Room 1004, Building A, D-innovation Center, No. 1190, Bin'an Road, Hangzhou 310052
T:(0571) 87770835 M:(86)18989451982 F:(0571) 86805750
E:pickup.li@xxxxxxxxxxxx
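
For anyone reproducing this, whether md has ejected the multipath member can be checked from /proc/mdstat while the IO load is running. The following is only a minimal sketch: it assumes the usual mdstat layout in which a failed member is tagged `(F)` and the per-array status appears as `[U_]`. The function name and the device names in the sample (`md0`, `dm-3`, `sda1`) are hypothetical and not taken from the report.

```python
import re

# One member entry on an md device line, e.g. "sda1[0]" or "dm-3[1](F)".
MEMBER_RE = re.compile(
    r"(?P<name>[^\s\[\]]+)\[(?P<slot>\d+)\](?P<flags>(?:\([A-Z]\))*)"
)


def parse_mdstat(text):
    """Parse /proc/mdstat-style text.

    Returns {array: {"members": {device: is_faulty}, "status": "[U_]" or None}}.
    """
    arrays = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(md\S*)\s*:\s*(.*)$", line)
        if header:
            current = header.group(1)
            members = {
                m.group("name"): "(F)" in m.group("flags")
                for m in MEMBER_RE.finditer(header.group(2))
            }
            arrays[current] = {"members": members, "status": None}
        elif current and line.startswith(" "):
            # Indented detail lines carry the up/down map, e.g. "[2/1] [U_]".
            status = re.search(r"\[[U_]+\]", line)
            if status:
                arrays[current]["status"] = status.group(0)
    return arrays


# Hypothetical sample: a two-disk raid1 whose multipath member has failed.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 dm-3[1](F) sda1[0]
      1048512 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""

if __name__ == "__main__":
    for name, info in parse_mdstat(SAMPLE).items():
        faulty = [dev for dev, bad in info["members"].items() if bad]
        print(name, info["status"], "faulty:", faulty)
```

On a test machine the sample text would be replaced by the contents of /proc/mdstat, read once per second while fio is running, to see at which point the member is marked faulty.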