Re: Incorrect in-kernel bitmap on raid10

On Sun, May 3, 2009 5:52 am, Mario 'BitKoenig' Holbe wrote:
> I guess, there is something else going wrong here. I attached a (quite
> large, sorry for that) transcript of what I was doing with the output of
> what I think could help.
>
> What I did:
> * I had a stable and clean raid10 out of 6 disks with superblocks all
>   uptodate, on-disk bitmaps all uptodate.
> * I failed and removed 3 of the disks.
> * I set some bits in the bitmap via mounting/umounting the raid10.
> * I stopped the raid10 just to make sure all superblocks/bitmaps are
>   uptodate.
> * I assembled the raid10 again with 3 out of 6 devices now.
> * I re-added the 3 missing disks.
>   Please note that I cannot add them all at the same time: if the
>   array is read-write, the resync starts immediately when the first
>   device is added, while I cannot add devices at all as long as the
>   array is read-only.
>   In this immediately-started re-sync only the first of the three
>   spares is synched; it seems to ignore the bitmap and generates
>   I/O all the time (I just forgot to c'n'p the evidence). I have
>   seen that three times now with iostat, it really does I/O.
> * I stopped and started the array to make it resync over all 3 spares
>   concurrently, it seems to ignore the bitmap, and it generates I/O
>   all the time again.

I managed to reproduce this thanks to all the detail you provided.
The problem was caused by trying to add a device to the array while the
array was read-only.
mdadm first attempts a re-add.  When that fails, it falls back to a
conventional add, which writes new metadata marking the device as a
bare spare.  That destroys the information that would have allowed
the fast, bitmap-based resync.

Can you duplicate the problem without setting the array to read-only?

The next mdadm release will check for EROFS from the re-add attempt and,
in that case, skip the conventional add, thus preserving the metadata.

Yes, it would be nice to be able to add multiple devices at once.
Maybe I could just get the kernel to wait 100ms after an add before
starting recovery in case a second device is about to be added.
I'll give it some thought.

NeilBrown

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
