On 20/12/2011 09:21, Robin Hill wrote:
On Tue Dec 20, 2011 at 09:46:13AM +0100, BERTRAND Joël wrote:
Hello,
I have been using several softraid volumes for a very long time. Last week, a disk
crashed on a raid6 volume and I tried to replace the faulty disk.
Today, when Linux boots, it only assembles this volume if the new disk
is marked as 'faulty' or 'removed', and I don't understand why...
The system is a sparc64 SMP server running Debian testing:
Root rayleigh:[~]> uname -a
Linux rayleigh 2.6.36.2 #1 SMP Sun Jan 2 11:50:13 CET 2011 sparc64
GNU/Linux
Root rayleigh:[~]> dpkg-query -l | grep mdadm
ii mdadm 3.2.2-1
The faulty device is /dev/sde1:
Root rayleigh:[~]> cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md7 : active raid6 sdc1[0] sdi1[6] sdh1[5] sdg1[4] sdf1[3] sdd1[1]
359011840 blocks level 6, 64k chunk, algorithm 2 [7/6] [UU_UUUU]
All disks (/dev/sd[cdefghi]) are the same model (Fujitsu SCA-2 73 GB) and
each disk contains only one partition (type FD, Linux raid autodetect). If I
add /dev/sde1 to the raid6 with mdadm -a /dev/md7 /dev/sde1, the disk is added
and my raid6 runs with all disks. But I get the same superblock on
/dev/sde1 and /dev/sde! If I remove the /dev/sde superblock, the /dev/sde1 one
disappears as well (I think both superblocks are one and the same).
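One way to confirm that the two superblocks really are one and the same is
to compare the two --examine outputs while skipping the device-name header
(read-only; the /tmp paths are just scratch files):

    mdadm --examine /dev/sde  | tail -n +2 > /tmp/sde.txt    # drop the "/dev/sde:" line
    mdadm --examine /dev/sde1 | tail -n +2 > /tmp/sde1.txt
    diff /tmp/sde.txt /tmp/sde1.txt    # empty output: md is reading one superblock twice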
<- SNIP info ->
All disks return the same information except /dev/sde while the array is
running: mdadm --examine /dev/sde and mdadm --examine /dev/sde1 return the same
information. What is my mistake? Is this a known issue?
It's a known issue with 0.90 superblocks, yes. There's no information in
the superblock which allows md to tell whether it's on the partition or
the disk, so for full-disk partitions the same superblock can be valid
for both. 1.x superblocks contain extra information which can be used to
differentiate between these. I'm a little surprised that the other
drives don't get detected in the same way though.
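For what it's worth, --examine shows which format a member uses, and a 1.x
member additionally records where its superblock is supposed to sit, which is
exactly the information md needs to reject the whole-disc alias:

    mdadm --examine /dev/sdc1 | grep -iE 'version|super offset'
    # a 0.90 member only shows its version; a 1.x member would also show
    # a "Super Offset : ... sectors" line tying the superblock to one device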
I think the above issue only occurs on partitions with particular
alignments. If I remember the layout right, the 0.90 superblock is written
64 KiB below the device size rounded down to a 64 KiB multiple, so the
whole-disc and partition superblocks land on the same sectors whenever the
partition starts on a 64 KiB (128-sector) boundary and runs to the end of
the disc. Old fdisk would always create the first partition starting at
sector 63, which is not such a boundary, and that was the case in the
output we saw for /dev/sdc, but a new fdisk will likely create the
partition starting at sector 2048, which is exactly such a boundary.
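To make the arithmetic concrete, here is the 0.90 offset calculation as a
shell sketch; the disc size is a made-up example, not the OP's actual
geometry:

    # 0.90 puts its superblock 64 KiB below the device size rounded down
    # to a 64 KiB multiple; everything below is in 512-byte sectors
    # (128 sectors = 64 KiB).
    DISK=143374744                 # hypothetical whole-disc size
    START=2048                     # partition start (new fdisk default)
    PART=$((DISK - START))         # partition runs to the end of the disc
    echo $(( (DISK / 128) * 128 - 128 ))          # superblock as seen via the disc
    echo $(( START + (PART / 128) * 128 - 128 ))  # superblock as seen via the partition
    # 2048 is a multiple of 128, so both lines print the same sector
    # (143374592 here); with the old START=63 they differ by 65 sectors
    # and the clash never happens.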
Alternatively, or additionally, the reason the other drives escape may be
that very old fdisk had a bug where it miscounted and didn't create
partitions right up to the last "cylinder" of the disc, so the md metadata
on the last partition wasn't in the same place as it would have been for
the whole disc.
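Either theory is easy to check against the actual geometry with read-only
commands:

    blockdev --getsz /dev/sde     # whole-disc size in 512-byte sectors
    fdisk -lu /dev/sde            # does sde1's End sector reach the disc's end?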
Either way, I would recommend that the OP --fail, --remove and
--zero-superblock his /dev/sde1, then copy a working partition table
from sdc with `dd if=/dev/sdc of=/dev/sde bs=512 count=1`, then
`blockdev --rereadpt /dev/sde`, then `fdisk -lu /dev/sde` just to make
sure that there is now an sde1 that's identical to sdc1, then --add the
new /dev/sde1.
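Spelled out, the whole sequence would look something like this (untested,
so double-check the device names before running anything destructive):

    mdadm /dev/md7 --fail /dev/sde1 --remove /dev/sde1
    mdadm --zero-superblock /dev/sde1   # also clears the /dev/sde alias, same block
    dd if=/dev/sdc of=/dev/sde bs=512 count=1   # copy the MBR/partition table
    blockdev --rereadpt /dev/sde
    fdisk -lu /dev/sde                  # sde1 should now match sdc1 exactly
    mdadm /dev/md7 --add /dev/sde1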
Hope this helps!
Cheers,
John.