(oops, sending again, the attachment was too big)

Hello,

1. mdadm segfault -> write-mostly problem

I was replacing my SAS drives last month; I now have 3x Seagate
Cheetah 15k rpm, 147 GB drives connected to an Adaptec ASR-2405
controller. For the migration of /home I used an extra disk and did
a live migration with LVM. It is now in a RAID10 configuration and
works great.

For the migration of /var and /tmp, which are on RAID1 + LVM, I used
the following trick. With the old SAS drives still inserted:

  mdadm --add /dev/md1 --write-mostly /dev/sdf2   (add the extra disk to the array)
  mdadm --fail /dev/md1 /dev/sdc1                 (fail an old SAS disk)
  mdadm --remove /dev/md1 /dev/sdc1
  mdadm --fail /dev/md1 /dev/sdb1                 (fail the other old SAS disk)
  mdadm --remove /dev/md1 /dev/sdb1

Then: turn off the PC, replace the SAS drives with the new ones,
turn the PC back on, prepare partitions on the fresh sda and sdb,
and then:

  mdadm --add /dev/md1 /dev/sdc1
  mdadm --add /dev/md1 /dev/sdb1
  mdadm --fail /dev/md1 /dev/sdf2
  mdadm --remove /dev/md1 /dev/sdf2
  mdadm --grow --raid-devices=2 /dev/md1

(here I forgot to --zero-superblock on sdf2)

Then reboot. During bootup I see mdadm segfaulting (a message in
dmesg saying that mdadm segfaults). I was able to bring up /dev/md1
manually, but only from /dev/sdf2, not from sdc1 + sdb1 (which
complained that their metadata is not compatible with the metadata
from the other devices belonging to md1).

Since my /dev/md1 was up and working (albeit from /dev/sdf2), I
repeated the operation shown above, but this time I afterwards used
`dd` to zero sdf2 completely. This time it worked.

So it works. But there is one strange quirk left: the two devices
sdc1 and sdb1 are in --write-mostly mode! I used this mode only for
sdf2, never for the other devices. Look at md1 here:

janek@atak:~$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10]
md2 : active raid10 sda2[0] sdc2[2] sdb2[1]
      185381376 blocks super 1.0 512K chunks 2 far-copies [3/3] [UUU]
      bitmap: 2/6 pages [8KB], 16384KB chunk

md1 : active raid1 sdc1[3](W) sdb1[5](W)
      9767416 blocks super 1.0 [2/2] [UU]
      bitmap: 6/150 pages [24KB], 32KB chunk

md0 : active raid1 sde1[0] sdd1[2] sda1[1]
      9767424 blocks [3/3] [UUU]
      bitmap: 1/150 pages [4KB], 32KB chunk

unused devices: <none>

Is it possible to change that (W) flag? Does it decrease performance
or anything else if both devices in the array are set to (W)?
(One idea I want to try is sketched below, after the smartd
question.)

Attached is the bootup dmesg with the mdadm segfaults.

2. smart?

Now I am trying to get smartd running with support for those SAS
drives. I managed to get some SMART info from them with:

  sudo modprobe sg
  sudo smartctl --all /dev/sg4 -d scsi

but I don't know what I should write in /etc/smartd.conf. DEVICESCAN
insists on not scanning the /dev/sg? devices, even when instructed
directly to do so:

  DEVICESCAN -d removable -d sata -d scsi,/dev/sg4 -d scsi,/dev/sg5 -d scsi,/dev/sg6 -d scsi,/dev/sg7 -n standby -m root -M exec /usr/share/smartmontools/smartd-runner

or not directly:

  DEVICESCAN -d sata -d scsi -n standby -m root -M exec /usr/share/smartmontools/smartd-runner

I tried dropping DEVICESCAN and putting in these lines instead:

  /dev/sde -H -l error -l selftest -t
  /dev/sg4 -H -l error -l selftest -t
  /dev/sg5 -H -l error -l selftest -t
  /dev/sg6 -H -l error -l selftest -t
  /dev/sg7 -H -l error -l selftest -t

Still nothing. Putting this line into /etc/default/smartmontools:

  enable_smart="/dev/sde /dev/sg4 /dev/sg5 /dev/sg6 /dev/sg7"

doesn't help either.

I know that this is not an mdadm-related problem, but since you all
here deal with HDDs I thought I would ask; maybe someone will know.
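One more thing I plan to try, though I have not yet (I am only
guessing here that smartd skips /dev/sg? nodes unless the device
type is forced on each explicit line, so the -d scsi part is the
point of this experiment):

  # /etc/smartd.conf - force the SCSI device type per line instead
  # of relying on autodetection; monitor health, error log and
  # self-test log, and mail root on trouble
  /dev/sg4 -d scsi -H -l error -l selftest -m root
  /dev/sg5 -d scsi -H -l error -l selftest -m root
  /dev/sg6 -d scsi -H -l error -l selftest -m root
  /dev/sg7 -d scsi -H -l error -l selftest -m root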
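By the way, coming back to the (W) flag from question 1: if I read
Documentation/md.txt correctly, the per-device "state" file in sysfs
accepts "writemostly" and "-writemostly", so something like the
following might clear the flag at runtime (untested on my machine,
and I don't know whether the change is written back to the
superblock or only lasts until the next assembly):

  # "-writemostly" should clear the flag; "writemostly" would set it
  echo -writemostly > /sys/block/md1/md/dev-sdc1/state
  echo -writemostly > /sys/block/md1/md/dev-sdb1/state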
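And for the record, if I understand the man page correctly, the
cleaner way to retire sdf2 (instead of dd'ing the whole partition,
which with a 1.0 superblock sitting at the end of the device is also
the slow way) would have been to wipe just the md metadata after
removing it from the array:

  mdadm --fail /dev/md1 /dev/sdf2
  mdadm --remove /dev/md1 /dev/sdf2
  mdadm --zero-superblock /dev/sdf2   (erase the stale md superblock)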
3. my system versions

Debian squeeze, fully up to date, except for the kernel version (due
to the TuxOnIce patch on a vanilla kernel).

atak:~# uname -a
Linux atak 2.6.29-bpo.2-amd64 #1 SMP Fri Jul 10 15:23:52 CEST 2009 x86_64 GNU/Linux
atak:~# mdadm --version
mdadm - v3.1.2 - 10th March 2010
atak:~# smartd --version
smartd 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)

thanks in advance for your help
--
Janek Kozicki                                 http://janek.kozicki.pl/
Attachment:
mdadm-segfault.txt.gz
Description: GNU Zip compressed data