Re: IMSM: Drive removed during I/O is set to faulty but not removed from volume

On Thu, 18 Jul 2024 16:57:03 +0200
Mateusz Kusiak <mateusz.kusiak@xxxxxxxxxxxxxxx> wrote:

> Hello,
> recently we discovered an issue regarding drive removal during I/O.
> 
> Description:
> Drive removed during I/O from IMSM R1D2 array is being set to faulty
> but is not removed from a volume. I/O on the array hangs.
> 
> The scenario is as follows:
> 1. Create R1D2 IMSM array.
> 2. Create a single partition, format it as ext4 and mount it somewhere.
> 3. Start multiple checksum test processes (more on that below) and
>    wait a while.
> 4. Unplug one RAID member.
> 

Thanks Mateusz, can you confirm whether this happens only with IMSM
metadata? In other words, does the issue also occur with native metadata?

-Paul

> About "Checksum test":
> Checksum test creates a ~3GB file and calculates its checksum twice.
> It basically does the following:
> # dd if=/proc/kcore bs=1024 count=3052871 status=none | tee <filename> | md5sum
> ...and then recalculates the checksum to verify that it matches. In this
> scenario we use it to simulate I/O by running multiple tests.
> 
> Expected result:
> RAID member is removed from the volume and the container, and the
> array continues operation on one drive.
> 
> Actual result:
> RAID member is set to faulty on the volume and does not disappear
> (it's not removed), but it is removed from the container. I/O on the
> mounted volume hangs.
> 
> Additional notes:
> The issue reproduces on kernel-next. Bisection points to the patch
> "md: use new apis to suspend array for adding/removing rdev from
> state_store()" (cfa078c8b80d0daf8f2fd4a2ab8e26fa8c33bca1) as a
> potential cause, as it's the first commit on which we observe the
> issue on our reproduction setup.
> 
> Having said that, we also observed the issue on, for example,
> SLES15SP6 with kernel 6.4.0-150600.10-default, which might indicate
> that the problem was already present but only became apparent for
> some reason (race condition or something else).
> 
> I will work on simplifying the scenario and try to provide a script
> for reproduction.
> 
> Thanks,
> Mateusz
> 
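
For reference, the checksum test described in the quoted report could be
sketched as a small shell function. This is only an illustration under
stated assumptions, not the actual test tool: the function name
checksum_test and its three arguments are hypothetical, and status=none
requires GNU dd.

```shell
# Hypothetical helper, not the original test tool.
# Usage: checksum_test <source> <count_in_KiB> <target_file>
checksum_test() {
    src=$1
    count=$2
    target=$3

    # Write the data to the target while hashing the stream in flight.
    first=$(dd if="$src" bs=1024 count="$count" status=none \
            | tee "$target" | md5sum | cut -d' ' -f1)

    # Read the file back and hash it again.
    second=$(md5sum < "$target" | cut -d' ' -f1)

    # Succeed only if both checksums match.
    [ "$first" = "$second" ]
}
```

Running several instances in the background against files on the mounted
volume, with /proc/kcore as the source and count=3052871 as in the
report, would approximate the described load. Reading /proc/kcore
normally requires root.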




