Re: Linux RAID autodetect partitions go missing from /dev, but fdisk can see them

Hi Neil!

Neil Brown wrote:
> On Wednesday December 10, torarnv@xxxxxxxxx wrote:
>> I have a very strange problem that I've been trying to debug for
>> days now. I had a RAID5 with four drives and one spare,
>> /dev/sd[bcde]1 + /dev/sdf1, and everything was working fine, until
>> one day one of the drives in the array (sdb) no longer had a
>> partition (sdb1). Letting the spare take over I ignored this for a
>> few days, but then it happened again, this time with sdc1.

>> I'm hoping someone on this list may have run into this before, or
>> has any tips on how I can continue debugging this, because I have to
>> admit I'm a little lost...
> 
> Yes, it does sound rather weird.

First of all, thank you so much for helping me out with this, as I'm
still very lost :)

In addition to the things listed in the first e-mail, I've also tried
installing the latest kernel from kernel.org, but that did not solve
anything. Also, in case it's relevant, I'm running openSUSE 10.3.

> Can you:
> 
>   mdadm -Esv

http://pastebin.com/d7b14d14e

For some reason it seems to think that /dev/sdc and /dev/sdb are part of
the array, while the actual members are /dev/sdc1 and /dev/sdb1. I'm
guessing that, since the partition device nodes are somehow missing from
/dev, mdadm assumes the disk itself is the member?
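
To double-check which disks the kernel currently sees without any
partitions, something along these lines could be used. It is only a rough
sketch: the function name disks_without_partitions is made up for
illustration, and it assumes the usual four-column /proc/partitions layout
and plain sdX / sdXN device names.

```shell
# Print sd* whole disks that have no partition entries in a
# /proc/partitions-style listing (defaults to /proc/partitions itself).
# Assumption: the device name is the fourth column, as in /proc/partitions.
disks_without_partitions() {
    awk '
        # Whole disks such as sdb
        $4 ~ /^sd[a-z]+$/ { disk[$4] = 1 }
        # Partitions such as sdb1: remember that the parent disk has one
        $4 ~ /^sd[a-z]+[0-9]+$/ { d = $4; sub(/[0-9]+$/, "", d); haspart[d] = 1 }
        END { for (d in disk) if (!(d in haspart)) print d }
    ' "${1:-/proc/partitions}" | sort
}

# Usage: disks_without_partitions            (reads /proc/partitions)
#        disks_without_partitions saved.txt  (reads a saved copy)
```

On this box I would expect it to list sdb and sdc, matching what
/proc/partitions shows in [3].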

> and
>   mdadm --stop /dev/md0
>   strace -o /tmp/str -s 200 mdadm --assemble --scan --verbose /dev/md0

http://pastebin.com/f2c1db2e4

The original array had sd[bcde]1 + sdf1 as spare. Then sdb1 went missing
and the spare kicked in, and then sdc1 went missing, leaving me with a
degraded array.

> Also the contents of /etc/mdadm.conf might help.

http://pastebin.com/f573346ef

Is there anything else I can run, cat, and/or paste that would shed
light on what's going on?
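
For instance, to pin down exactly which entries vanish during the assemble
(the behaviour shown in [4]), /proc/partitions could be snapshotted before
and after and the two copies compared. This is a minimal sketch, not a
tested procedure: lost_partitions is a hypothetical helper name, and the
/tmp file names are arbitrary.

```shell
# Print device names that appear in the first /proc/partitions-style
# snapshot but not in the second.
#   $1 = snapshot taken before, $2 = snapshot taken after
lost_partitions() {
    awk '
        # Skip the header and blank lines: keep only "major minor blocks name"
        NF != 4 || $1 !~ /^[0-9]+$/ { next }
        # First file read is the "after" snapshot: remember its names
        NR == FNR { after[$4] = 1; next }
        # Second file read is the "before" snapshot: report what went away
        !($4 in after) { print $4 }
    ' "$2" "$1"
}

# Intended use (as root), with the commands already mentioned in this thread:
#   partprobe -s
#   cat /proc/partitions > /tmp/parts.before
#   mdadm --assemble --scan --verbose /dev/md0
#   cat /proc/partitions > /tmp/parts.after
#   lost_partitions /tmp/parts.before /tmp/parts.after
```

If the nodes really disappear as part of the assemble, the output should
name the partitions that were present right after partprobe and gone
afterwards.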

> Thanks,

Thank _you_ :)

Tor Arne



>> raid support in. The symptoms are:
>>
>>   - The kernel seems to detect the partitions (lines 396 and 407 in the
>> dmesg [1])
>>
>>   - But once the boot process finishes and the RAID is started, there is
>> no longer any sdc1 or sdb1, so the RAID fails to start (lines 550-576 in
>> dmesg [1])
>>
>>   - Running fdisk -l shows that the drives in question (sdb and sdc)
>> have the same kind of partitions as the other working drives, namely
>> one Linux RAID autodetect partition each (see command output [2])
>>
>>   - However, the partitions are missing from /proc/partitions (see [3])
>>
>>   - Manually adding device nodes using mknod works, but doing file -sL
>> on the device gives "writable, no read permission", even though
>> permissions are the same as the other sd* nodes in /dev
>>
>>   - Running 'partprobe -s' successfully finds the two missing partitions
>> and adds device nodes, and the nodes can be 'file -sL'ed, but when
>> trying to assemble the array again with these new nodes in the system,
>> I'm told that sdc1 is not found, and after the --assemble is done, the
>> device nodes are once again missing (!) see [4]
>>
>>   - I've tried using the 'dmraid' command to look for fakeraid
>> partitions or metadata on the drives, which I was told could mess up
>> the auto-detection of Linux software RAID partitions, but could not find
>> any issues.
>>
>>
>> As you can tell I've exhausted all my current options, so any help on
>> what I could try next would be very much appreciated. I am especially
>> curious as to why I lose the partitions when mdadm tries to assemble the
>> array?
>>
>> Thanks!
>>
>> Tor Arne Vestbø
>>
>> [1] http://pastebin.com/m15b9c275   dmesg
>> [2] http://pastebin.com/f50fb323a   fdisk -l
>> [3] http://pastebin.com/f4547c2ca   cat /proc/partitions
>> [4] http://pastebin.com/m4475c9ae   partprobe + mdadm --assemble
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

