Re: Unexpected mdadm behavior with old replugged disc

On 18/11/17 15:06, Matthias Walther wrote:
> On 18.11.2017 at 15:58, Wols Lists wrote:
>> On 18/11/17 14:35, Matthias Walther wrote:
>>> Hello,
>>>
>>> I just signed up for this mailing list to discuss the following,
>>> unexpected behavior:
>>>
>>> Situation: RAID6 with 6 discs. For reasons that are unimportant here, I
>>> had previously replaced a disc that was still fully functional. This disc
>>> was never changed or written to in the meantime.
>>>
>>> Today I plugged this particular disc back into the server as an additional
>>> 7th disc (cold plug, the server was switched off).
>>>
>>> Unexpectedly, mdadm broke up my fully synced RAID6 and is now syncing back
>>> to this old disc, dropping one of the newer discs from the array.
>>>
>>> This might be because its UUID is still stored with a higher rank than the
>>> newer disc, or because the old disc got a lower sdX slot. I don't know the
>>> details.
>>>
>>> Anyway, I wouldn't expect mdadm to act like this. It could use the old,
>>> now re-plugged disc as a hot spare, or ignore it altogether. But it
>>> shouldn't break a fully synced array. I have had reduced redundancy for
>>> about 24 hours now - without any rational reason.
> Hello,
> 
> thanks for your quick reply.

>> Just a guess? "mdadm --assemble --incremental"?

> What do you mean by this guess? I didn't do anything; it all happened
> automatically.
>>
Exactly. As I understand it, this is the command the boot sequence runs. It
doesn't wait until all the drives are available (it can't know when all the
drives will be available), so it adds each drive as it sees it.

>> What I *suspect* happened is that, as the system booted, mdadm scanned
>> the drives as they became available, and because this drive became
>> available before some of the others, it got included in the array.

> Probably.
>>
>> I can't, off the top of my head, think of any way to stop this happening,
>> other than preventing raid assembly during boot, or having an *accurate*
>> mdadm.conf from which mdadm could realise this drive wasn't meant to be
>> included.
>>
>> Did you update mdadm.conf after you removed this drive? Do you even have
>> an mdadm.conf?

> No, I've always relied on auto-configuration. So if I had had a correctly
> updated mdadm.conf, this wouldn't have happened?

I don't know. My system doesn't have an mdadm.conf. But if you had an
mdadm.conf, it may well have told mdadm that the drive didn't belong there.
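For illustration, an mdadm.conf that pins the array to its UUID and member
count looks roughly like this (the UUID below is made up, not yours):

    DEVICE partitions
    ARRAY /dev/md0 metadata=1.2 num-devices=6 UUID=12345678:9abcdef0:12345678:9abcdef0

You can generate the real line with "mdadm --detail --scan" once the array is
assembled correctly.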
>>
>> The only good point here, is that if you had three such drives, mdadm
>> would almost certainly have failed the array as it booted, and left you
>> in an (easily) recoverable situation. I don't really see what else it
>> could have done?
>>
>> Cheers,
>> Wol
> 
> I have to reassemble it manually now anyway. The system crashed, the order
> changed again, and it doesn't assemble automatically at the moment. I'll
> have to work out which discs to use.

Download the lsdrv script; it'll give you a load of info. And mdadm
--examine or --detail should tell you everything you need to know.
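Something like this (device names are just examples, adjust to your setup):

    mdadm --examine /dev/sd[a-g]1
    mdadm --detail /dev/md0

--examine reads the superblock on each member and shows its role, event count
and array UUID; --detail shows the assembled array's view of its members.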
> 
> Once reassembled, I'll create a mdadm.conf to prevent this in the future.
> 
Next time you remove a drive, it would pay you to use --zero-superblock (I
believe that's the option). I suspect using --replace would also flag the
removed drive as no longer valid.
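Roughly, removing a member for good would look something like this (the drive
name is only an example):

    mdadm /dev/md0 --fail /dev/sdg1 --remove /dev/sdg1
    mdadm --zero-superblock /dev/sdg1

Zeroing the superblock means the disc no longer claims to belong to the
array, so a later boot can't pull it back in.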

Cheers,
Wol
