And then there was Bryce...

Gosh, where to start...

OK, general setup:

I'm using kernel version 2.6.17-rc5 and RAID 5 over five 500GB SATA disks.

(boring dump)
-----------------------------------------------------------------------
[root@emerald ~]# mdadm -D /dev/md0
/dev/md0:
       Version : 00.90.03
 Creation Time : Sat May 27 20:49:13 2006
    Raid Level : raid5
    Array Size : 1953533952 (1863.04 GiB 2000.42 GB)
   Device Size : 488383488 (465.76 GiB 500.10 GB)
  Raid Devices : 5
 Total Devices : 5
Preferred Minor : 0
   Persistence : Superblock is persistent

   Update Time : Thu Jun  8 01:05:24 2006
         State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
 Spare Devices : 0

        Layout : left-symmetric
    Chunk Size : 1024K

          UUID : d8d7cacb:24db29e6:46ace8ec:49547cc4
        Events : 0.143369

   Number   Major   Minor   RaidDevice State
      0       8       17        0      active sync   /dev/sdb1
      1       8       33        1      active sync   /dev/sdc1
      2       8       49        2      active sync   /dev/sdd1
      3       8       65        3      active sync   /dev/sde1
      4       8       81        4      active sync   /dev/sdf1
-----------------------------------------------------------------------
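
As a quick sanity check on the dump above, the reported array size can be reproduced from the per-device size: RAID 5 gives up one device's worth of space to parity, so usable space is (devices - 1) x device size. This is only a sketch; the constant and function names are made up for illustration, and the figures are simply copied from the mdadm output (which reports them in KiB).

```python
# Figures copied from the mdadm -D output above; mdadm reports them in KiB.
RAID_DEVICES = 5
DEVICE_SIZE_KIB = 488383488
REPORTED_ARRAY_SIZE_KIB = 1953533952


def raid5_usable_kib(devices, device_size_kib):
    """RAID 5 spends one device's worth of space on parity."""
    return (devices - 1) * device_size_kib


usable = raid5_usable_kib(RAID_DEVICES, DEVICE_SIZE_KIB)
print(usable == REPORTED_ARRAY_SIZE_KIB)   # do the two figures agree?
print(f"{usable / 1024**2:.2f} GiB")       # KiB -> GiB
```

The computed value matches the Array Size line, i.e. four data disks' worth of space out of five.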

Anyway, I happen to have a 512MB USB pen drive that I was playing with earlier, and I left it attached over a reboot.

What follows is horrifying.

From the syslog...

Jun  7 18:47:10 Emerald syslogd 1.4.1: restart.
Jun  7 18:47:10 Emerald kernel: klogd 1.4.1, log source = /proc/kmsg started.
Jun  7 18:47:10 Emerald kernel: Linux version 2.6.17-rc5 (root@emerald) (gcc version 4.1.0 20060304 (Red Hat 4.1.0-3)) #2 SMP Sun May 28 15:29:46 BST 2006
...
Everything going OK... normal boot,
and then it all goes horribly wrong...


Jun 7 18:52:30 Emerald kernel: raid5: Disk failure on sde1, disabling device. Operation continuing on 3 devices
Jun  7 18:52:30 Emerald kernel: RAID5 conf printout:
Jun  7 18:52:30 Emerald kernel:  --- rd:5 wd:3 fd:2
Jun  7 18:52:30 Emerald kernel:  disk 0, o:1, dev:sdb1
Jun  7 18:52:30 Emerald kernel:  disk 1, o:1, dev:sdd1
Jun  7 18:52:30 Emerald kernel:  disk 2, o:0, dev:sde1
Jun  7 18:52:30 Emerald kernel:  disk 4, o:1, dev:sdg1
Jun  7 18:52:30 Emerald kernel: RAID5 conf printout:
Jun  7 18:52:30 Emerald kernel:  --- rd:5 wd:3 fd:2
Jun  7 18:52:30 Emerald kernel:  disk 0, o:1, dev:sdb1
Jun  7 18:52:30 Emerald kernel:  disk 1, o:1, dev:sdd1
Jun  7 18:52:30 Emerald kernel:  disk 4, o:1, dev:sdg1
Jun 7 18:54:37 Emerald kernel: Buffer I/O error on device dm-2, logical block 0
Jun  7 18:54:37 Emerald kernel: lost page write due to I/O error on dm-2
Jun  7 18:57:11 Emerald kernel: Buffer I/O error on device md0, logical block 488383472
Jun  7 18:57:11 Emerald kernel: Buffer I/O error on device md0, logical block 488383472
Jun  7 18:57:11 Emerald kernel: Buffer I/O error on device md0, logical block 488383486
Jun  7 18:57:11 Emerald kernel: Buffer I/O error on device md0, logical block 488383486
Jun  7 19:05:10 Emerald kernel: md: unbind<sde1>
Jun  7 19:05:10 Emerald kernel: md: export_rdev(sde1)
Jun  7 19:05:15 Emerald kernel: md: bind<sde1>

But wait a sec... WTF is this sdg1 in the RAID printout?
Reading back in the syslog, I see:

Jun 7 18:47:26 Emerald kernel: SCSI device sdg: 976773168 512-byte hdwr sectors (500108 MB)
Jun  7 18:47:26 Emerald kernel: sdg: Write Protect is off
Jun  7 18:47:26 Emerald kernel: SCSI device sdg: drive cache: write back
Jun 7 18:47:26 Emerald kernel: SCSI device sdg: 976773168 512-byte hdwr sectors (500108 MB)
Jun  7 18:47:26 Emerald kernel: sdg: Write Protect is off
Jun  7 18:47:26 Emerald kernel: SCSI device sdg: drive cache: write back
Jun  7 18:47:26 Emerald kernel:  sdg: sdg1
Jun  7 18:47:26 Emerald kernel: sd 6:0:0:0: Attached scsi disk sdg

Well, that's nice, that's my pendrive! So what happened when it set up the array?

Jun  7 18:47:30 Emerald kernel: md: Autodetecting RAID arrays.
Jun  7 18:47:30 Emerald kernel: md: autorun ...
Jun  7 18:47:30 Emerald kernel: md: considering sdg1 ...
Jun  7 18:47:30 Emerald kernel: md:  adding sdg1 ...
Jun  7 18:47:30 Emerald kernel: md:  adding sdf1 ...
Jun  7 18:47:30 Emerald kernel: md:  adding sde1 ...
Jun  7 18:47:30 Emerald kernel: md:  adding sdd1 ...
Jun  7 18:47:30 Emerald kernel: md:  adding sdb1 ...
Jun  7 18:47:30 Emerald kernel: md: created md0
Jun  7 18:47:30 Emerald kernel: md: bind<sdb1>
Jun  7 18:47:31 Emerald kernel: md: bind<sdd1>
Jun  7 18:47:31 Emerald kernel: md: bind<sde1>
Jun  7 18:47:31 Emerald kernel: md: bind<sdf1>
Jun  7 18:47:31 Emerald kernel: md: bind<sdg1>
Jun  7 18:47:31 Emerald kernel: md: running: <sdg1><sdf1><sde1><sdd1><sdb1>
Jun  7 18:47:31 Emerald kernel: md: kicking non-fresh sdf1 from array!
Jun  7 18:47:31 Emerald kernel: md: unbind<sdf1>
Jun  7 18:47:31 Emerald kernel: md: export_rdev(sdf1)
Jun 7 18:47:31 Emerald kernel: raid5: automatically using best checksumming function: pIII_sse
Jun  7 18:47:31 Emerald kernel:    pIII_sse  :  4203.000 MB/sec
Jun 7 18:47:31 Emerald kernel: raid5: using function: pIII_sse (4203.000 MB/sec)
Jun  7 18:47:31 Emerald kernel: md: raid5 personality registered for level 5
Jun  7 18:47:31 Emerald kernel: md: raid4 personality registered for level 4
Jun  7 18:47:31 Emerald kernel: raid5: device sdg1 operational as raid disk 4
Jun  7 18:47:31 Emerald kernel: raid5: device sde1 operational as raid disk 2
Jun  7 18:47:31 Emerald kernel: raid5: device sdd1 operational as raid disk 1
Jun  7 18:47:31 Emerald kernel: raid5: device sdb1 operational as raid disk 0
Jun  7 18:47:31 Emerald kernel: raid5: allocated 5248kB for md0
Jun 7 18:47:31 Emerald kernel: raid5: raid level 5 set md0 active with 4 out of 5 devices, algorithm 2
Jun  7 18:47:31 Emerald kernel: RAID5 conf printout:
Jun  7 18:47:31 Emerald kernel:  --- rd:5 wd:4 fd:1
Jun  7 18:47:31 Emerald kernel:  disk 0, o:1, dev:sdb1
Jun  7 18:47:31 Emerald kernel:  disk 1, o:1, dev:sdd1
Jun  7 18:47:31 Emerald kernel:  disk 2, o:1, dev:sde1
Jun  7 18:47:31 Emerald kernel:  disk 4, o:1, dev:sdg1
Jun  7 18:47:31 Emerald kernel: md: ... autorun DONE.
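
For anyone wanting to spot this kind of thing mechanically, here is a rough sketch that pulls the slot assignments out of an autorun log like the one above and compares them with the member list that `mdadm -D` shows for the healthy array. The names `EXPECTED`, `audit_assembly` and `SAMPLE` are invented for the illustration, and the regular expressions only assume the two kernel message formats visible in this log.

```python
import re

# Slot -> member mapping as reported by `mdadm -D /dev/md0` in this post.
EXPECTED = {0: "sdb1", 1: "sdc1", 2: "sdd1", 3: "sde1", 4: "sdf1"}

# Message formats taken from the syslog excerpt above.
OPERATIONAL = re.compile(r"raid5: device (\w+) operational as raid disk (\d+)")
KICKED = re.compile(r"md: kicking non-fresh (\w+) from array")


def audit_assembly(log_text, expected=EXPECTED):
    """Return (mismatches, kicked) for one autorun pass found in log_text.

    mismatches maps slot -> (expected device, assembled device or None).
    """
    seen = {int(slot): dev for dev, slot in OPERATIONAL.findall(log_text)}
    mismatches = {
        slot: (want, seen.get(slot))
        for slot, want in expected.items()
        if seen.get(slot) != want
    }
    return mismatches, KICKED.findall(log_text)


# The relevant messages from the 18:47:31 assembly, timestamps stripped.
SAMPLE = """\
md: kicking non-fresh sdf1 from array!
raid5: device sdg1 operational as raid disk 4
raid5: device sde1 operational as raid disk 2
raid5: device sdd1 operational as raid disk 1
raid5: device sdb1 operational as raid disk 0
"""

if __name__ == "__main__":
    mismatches, kicked = audit_assembly(SAMPLE)
    for slot, (want, got) in sorted(mismatches.items()):
        print(f"slot {slot}: expected {want}, assembled with {got}")
    print("kicked as non-fresh:", kicked)
```

Run against this boot, it flags slots 1, 2 and 4 as holding differently named devices than the mdadm listing, slot 3 as empty, and sdf1 as kicked. Device names can change between boots, so a name comparison like this is only a first hint; checking the superblock UUID on each member (for example with `mdadm --examine`) is probably the more reliable test.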

WHAT THE HELL?!??
*considering sdg1* ?!?! Then deciding it was fair game to use?!??
It's a FAT16 pendrive with NO UUID stuff on it...
Suddenly the RAID5 gets very unhappy and becomes a RID5, and I spend the next few hours rebuilding it (fortunately all data was preserved, but it wasn't a pleasant evening, I can tell you).

Hum ho... I survived the horror, but umm, well, I'll leave the above as a story to frighten young sysadmins with.

Phil
=--=

