Re: [PATCH] raid0: fix set_disk_faulty doesn't return -EBUSY

On 2023/3/22 15:05, Mariusz Tkaczyk wrote:
> On Wed, 22 Mar 2023 10:24:41 +0800
> Wu Guanghao <wuguanghao3@xxxxxxxxxx> wrote:
> 
>> On 2023/3/21 18:18, Mariusz Tkaczyk wrote:
>>> On Tue, 21 Mar 2023 16:56:37 +0800
>>> Wu Guanghao <wuguanghao3@xxxxxxxxxx> wrote:
>>>   
>>>> With the latest kernel, mdadm no longer reports an error when
>>>> set_disk_faulty is issued against a raid0 member.
>>>>
>>>> $ lsblk
>>>> sdb                                           8:16   0   10G  0 disk
>>>> └─md0                                         9:0    0 19.9G  0 raid0
>>>> sdc                                           8:32   0   10G  0 disk
>>>> └─md0                                         9:0    0 19.9G  0 raid0
>>>>
>>>> old kernel:
>>>> ...
>>>> $ mdadm /dev/md0 -f /dev/sdb
>>>> mdadm: set device faulty failed for /dev/sdb:  Device or resource busy
>>>> ...
>>>>
>>>> latest kernel:
>>>> ...
>>>> $ mdadm /dev/md0 -f /dev/sdb
>>>> mdadm: set /dev/sdb faulty in /dev/md0
>>>> ...
>>>>
>>>> The old kernel checks whether the Faulty flag is set in rdev->flags
>>>> and returns -EBUSY if it is not. The latest kernel returns -EBUSY only
>>>> if the MD_BROKEN flag is set in mddev->flags. raid0 doesn't set
>>>> error_handler, so MD_BROKEN is never set and the call returns 0.
>>>>
>>>> So if error_handler isn't set for a raid type, also return -EBUSY.
>>> Hi,
>>> Please test with:
>>> https://lore.kernel.org/linux-raid/20230306130317.3418-1-mariusz.tkaczyk@xxxxxxxxxxxxxxx/
>>>
>>> Thanks,
>>> Mariusz
>>>   
>>
>> Hi, Mariusz
>>
>> Are there other patches? This patch has other problems:
>> https://lore.kernel.org/linux-raid/20230306130317.3418-1-mariusz.tkaczyk@xxxxxxxxxxxxxxx/
>>
>> md_submit_bio()
>> 	...
>> 	// Setting the raid0 disk faulty failed, but the MD_BROKEN flag
>> 	// is still set, so write IO will fail.
>> 	if (unlikely(test_bit(MD_BROKEN, &mddev->flags)) && (rw == WRITE)) {
>> 		bio_io_error(bio);
>> 		return;
>> 	}
>> 	...
>>
>> old kernel:
>> ...
>> $ mdadm /dev/md0 -f /dev/sdb
>> mdadm: set device faulty failed for /dev/sdb:  Device or resource busy
>>
>> $ mkfs.xfs /dev/md0
>> log stripe unit (524288 bytes) is too large (maximum is 256KiB)
>> log stripe unit adjusted to 32KiB
>> meta-data=/dev/md0               isize=512    agcount=16, agsize=1800064 blks
>>          =                       sectsz=512   attr=2, projid32bit=1
>>          =                       crc=1        finobt=1, sparse=1, rmapbt=0
>>          =                       reflink=1    bigtime=0 inobtcount=0
>> data     =                       bsize=4096   blocks=28801024, imaxpct=25
>>          =                       sunit=128    swidth=256 blks
>> naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
>> log      =internal log           bsize=4096   blocks=14064, version=2
>>          =                       sectsz=512   sunit=8 blks, lazy-count=1
>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>> Discarding blocks...Done.
>> ...
>>
>>
>> merged patch kernel:
>> ...
>> # mdadm /dev/md0 -f /dev/sdb
>> mdadm: set device faulty failed for /dev/sdb:  Device or resource busy
>>
>> # mkfs.xfs /dev/md0
>> log stripe unit (524288 bytes) is too large (maximum is 256KiB)
>> log stripe unit adjusted to 32KiB
>> meta-data=/dev/md0               isize=512    agcount=8, agsize=65408 blks
>>          =                       sectsz=512   attr=2, projid32bit=1
>>          =                       crc=1        finobt=1, sparse=1, rmapbt=0
>>          =                       reflink=1    bigtime=0 inobtcount=0
>> data     =                       bsize=4096   blocks=523264, imaxpct=25
>>          =                       sunit=128    swidth=256 blks
>> naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
>> log      =internal log           bsize=4096   blocks=2560, version=2
>>          =                       sectsz=512   sunit=8 blks, lazy-count=1
>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>> mkfs.xfs: pwrite failed: Input/output error
>> ...
>>
>>
> Hi Wu,
> Besides the kernel, there are also patches in mdadm. Please check that you
> have them all.
> https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=b3e7b7eb1dfedd7cbd9a3800e884941f67d94c96
> https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=461fae7e7809670d286cc19aac5bfa861c29f93a
> https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=fc6fd4063769f4194c3fb8f77b32b2819e140fb9

Hi, Mariusz

Thanks for your reply. I will test with the above patches.

> Some background:
> --faulty (-f) is intended to be used by administrators. We cannot rely on
> the kernel's answer, because if mdadm tries to set the device faulty, the
> array is marked MD_BROKEN and every new IO will fail (that is an intended
> change).
> 
> Put simply, mdadm must first check whether it can remove the drive, and that
> check was added by the patches mentioned above. The first patch (the last one
> listed) added the verification but introduced a regression; the other two
> patches are fixes for scenarios it missed.
> 
> Thanks,
> Mariusz
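
For reference, the return-value difference described at the top of this thread can be sketched as a small user-space model in C. This is not kernel code: every toy_* name is hypothetical, the two structs only stand in for the Faulty bit in rdev->flags and the MD_BROKEN bit in mddev->flags, and a NULL handler models a level without error_handler (raid0, as stated in the original report).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's real structures. */
struct toy_rdev {
	bool faulty;                 /* models the Faulty bit in rdev->flags */
};

struct toy_mddev;
typedef void (*toy_error_fn)(struct toy_mddev *, struct toy_rdev *);

struct toy_mddev {
	bool broken;                 /* models the MD_BROKEN bit in mddev->flags */
	toy_error_fn error_handler;  /* NULL models a level without a handler */
};

/* Assumed handler for a redundant level: marks the member Faulty
 * and leaves the array usable. */
void toy_redundant_error(struct toy_mddev *mddev, struct toy_rdev *rdev)
{
	(void)mddev;
	rdev->faulty = true;
}

/* Calls the level's handler if it has one; otherwise does nothing. */
void toy_md_error(struct toy_mddev *mddev, struct toy_rdev *rdev)
{
	assert(mddev != NULL && rdev != NULL);
	if (mddev->error_handler)
		mddev->error_handler(mddev, rdev);
}

/* Old behaviour as described: -EBUSY unless the member became Faulty. */
int toy_set_faulty_old(struct toy_mddev *mddev, struct toy_rdev *rdev)
{
	toy_md_error(mddev, rdev);
	return rdev->faulty ? 0 : -EBUSY;
}

/* Latest behaviour as described: -EBUSY only if the array is MD_BROKEN. */
int toy_set_faulty_new(struct toy_mddev *mddev, struct toy_rdev *rdev)
{
	toy_md_error(mddev, rdev);
	return mddev->broken ? -EBUSY : 0;
}

/* The change proposed in the patch: also -EBUSY when no handler exists. */
int toy_set_faulty_proposed(struct toy_mddev *mddev, struct toy_rdev *rdev)
{
	if (mddev->error_handler == NULL)
		return -EBUSY;
	return toy_set_faulty_new(mddev, rdev);
}
```

With a handler-less array (error_handler == NULL), the old and proposed variants return -EBUSY while the "latest" variant returns 0, which matches the mdadm output quoted in the report. The model deliberately leaves out the mdadm-side pre-check that Mariusz describes.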