Re: mdadm's raid1 will not eliminate abnormal disk after 5 seconds under IO pressure

I have read the description of the failfast feature. Judging by the
behaviour we see, this may not be a failfast problem.
When there is no IO pressure, the disk is automatically removed from
the md array after we stop exporting it on the storage node.
Under continuous IO pressure, however, the disk is not removed
automatically; it is removed immediately once the IO pressure stops.
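
For anyone who wants to time the eviction while reproducing this, here is a
minimal sketch (not part of our test setup) that reports which members the
kernel has marked faulty. It assumes the usual /proc/mdstat layout, where a
failed member carries an "(F)" suffix; the function name faulty_members is
only illustrative.

```python
import re

# A member entry on an array line looks like "sda1[0]" or "dm-3[1](F)".
MEMBER_RE = re.compile(
    r"(?P<dev>[^\s\[]+)\[(?P<slot>\d+)\](?P<flags>(?:\([A-Z]\))*)"
)


def faulty_members(mdstat_text):
    """Map each md array name to the members flagged (F) in mdstat text."""
    result = {}
    for line in mdstat_text.splitlines():
        # Array lines look like: "md0 : active raid1 dm-3[1](F) sda1[0]"
        if not line.startswith("md") or " : " not in line:
            continue
        name, rest = line.split(" : ", 1)
        result[name.strip()] = [
            m.group("dev")
            for m in MEMBER_RE.finditer(rest)
            if "(F)" in m.group("flags")
        ]
    return result
```

Reading /proc/mdstat once a second and passing its contents to this function,
with and without the fio load running, should show how long the multipath
member stays in the array after the export is stopped.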

Xiao Ni <xni@xxxxxxxxxx> wrote on Wed, Jan 30, 2019 at 5:15 PM:
>
>
>
> On 01/30/2019 03:25 PM, Jack Wang wrote:
> > 李春 <pickup112@xxxxxxxxx> wrote on Wed, Jan 30, 2019 at 7:08 AM:
> >> # Description of problem:
> >> We attached a disk from a storage node over two networks via iSCSI,
> >> merged the two paths into one device with multipath, and built a
> >> raid1 from it and a local disk with mdadm.
> >> However, when the iSCSI storage machine reboots, the raid1 does not
> >> automatically eject the abnormal disk while there is IO pressure.
> >>
> >> # Version-Release number of selected component (if applicable):
> >> vermagic: 2.6.32-573.el6.x86_64 SMP mod_unload modversions
> >> srcversion: 39AAB97325332236F2FFCA9
> >>
> >> # How reproducible:
> >> always
> >>
> >> # Steps to Reproduce:
> >> 1. export a disk from the storage node
> >> 2. attach the disk on another node and merge its paths with multipath
> >> 3. assemble a local disk and the multipath device into a raid1 with mdadm
> >> 4. reboot the storage node
> >>
> >> # Actual results:
> >> * the multipath disk is not ejected from the raid1 under fio pressure
> >> * the multipath disk is ejected from the raid1 immediately once the fio pressure stops
> >>
> >> # Expected results:
> >> * the multipath disk is ejected from the raid1 immediately, even under fio pressure
> >>
> >> # Additional info:
> >> We have done the following tests:
> >> * On rhel6.7 with kernel 2.6.32-573.el6.x86_64, without IO pressure,
> >> mdadm's raid1 removes the abnormal disk after 5 seconds
> >> * On rhel6.7 with kernel 2.6.32-573.el6.x86_64, under IO pressure,
> >> mdadm's raid1 does not eject the abnormal disk; the disk is removed
> >> only once the IO pressure stops
> >> * On rhel7.4 with kernel 3.10.0-693.el7.x86_64, without IO pressure,
> >> mdadm's raid1 removes the abnormal disk after 5 seconds
> >> * On rhel7.4 with kernel 3.10.0-693.el7.x86_64, under IO pressure,
> >> mdadm's raid1 removes the abnormal disk after 5 seconds
> >>
> >> Thanks for your help.
> > It sounds like you want the failfast feature from upstream; I am not
> > sure whether RH has backported it into their kernel.
> Thanks for the report and analysis.
> rhel6 is in a phase where only bug fixes are recommended, so some
> features are not backported.
> I'll try to backport this to rhel6.
>
> Regards
> Xiao



-- 
李春 Pickup Li
Product R&D Department, Chief Architect

www.woqutech.com
Hangzhou WOQU Technology Co., Ltd.
Room 1004, Building A, D-innovation Center, No. 1190, Bin'an Road,
Hangzhou 310052


T:(0571) 87770835
M:(86)18989451982
F:(0571) 86805750
E:pickup.li@xxxxxxxxxxxx



