Subject : Raid-5 autodetection problem
OS      : SuSE 7.3 for Sparc64
Kernel  : 2.4.19-pre4
Hardware: Sun Ultra10, 512MB RAM, 1x 20GB Sun IDE HD (/dev/hda),
          3x SCSI HDs in a software RAID-5 config (/dev/md0 and /dev/md1)
          attached to a Tekram DC390U2B Ultra2 controller
Severity: I need a drink...

Hello,

I have a problem with my Sun Ultra10 machine. I have 3 SCSI disks connected to a Tekram SCSI controller (sym53c8xx driver), and I boot from the onboard IDE disk /dev/hda. The SCSI disks are all part of a perfectly working RAID-5 array.

I copied the contents of the IDE disk to /dev/md0 and modified the /etc/fstab copied to the array to say that / sits on /dev/md0. All drivers are built into the kernel. I made a new SILO entry with root=/dev/md0 so I can switch between booting with root on the IDE disk and with root on the RAID-5 array.

The problem is that the array is detected, but not in time. After the SCSI driver is loaded, the raid5 driver is loaded and the raid5 checksumming speed is measured; then an autodetect is run which does NOT find the array at all. The kernel then tries to mount /, which of course fails because there is no /dev/md0 yet.

The persistent superblock is present on all disks. The first partition starts on cylinder 1 instead of 0 because of the "after reboot the partition table is hosed" problem; when the first partition starts on cylinder 1, the partition table stays perfectly intact.

When booting with / on the IDE disk, I noticed that the first time the RAID autodetect is run the array is not found at all, but after the root is mounted ("VFS: Mounted root") there is another autodetect which DOES detect the array properly.

IMHO, my problem is that the first array autodetect fails. If it detected the array the first time around, the /dev/md0 device would exist BEFORE the root mount and I would be a happy guy 8-)

I have included the relevant part of boot.msg below; it shows what I mean. The interesting part is right after the SCSI driver loading.
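Before the log, for reference, this is roughly what the setup looks like. Treat the image path, labels, and the IDE partition below as illustrative rather than copied verbatim from my config; the only point is root=/dev/md0 versus root on the IDE disk:

  # roughly how the root filesystem got onto the array (illustrative):
  mount /dev/md0 /mnt
  cp -ax / /mnt

  # /etc/silo.conf -- one entry per root device, selectable at the boot prompt
  image = /boot/vmlinux          # illustrative path
    label = linux-ide
    root = /dev/hda1             # illustrative partition; root on the onboard IDE disk
  image = /boot/vmlinux
    label = linux-raid
    root = /dev/md0              # root on the RAID-5 array

  # /etc/fstab on the array copy -- only the root line was changed
  /dev/md0   /   ext3   defaults   1 1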
Inspecting /boot/System.map-2.4.19-pre4
Loaded 16938 symbols from /boot/System.map-2.4.19-pre4.
Symbols match kernel version 2.4.19.
No module symbols loaded.
klogd 1.4.1, log source = ksyslog started.
<4>PROMLIB: Sun IEEE Boot Prom 3.25.3 2000/06/29 14:12
<4>Linux version 2.4.19-pre4 (root@picasso) (gcc version egcs-2.92.11 19980921 (gcc2 ss-980609 experimental)) #2 Tue Mar 26 23:35:26 WET 2002
<4>ARCH: SUN4U
-- snip snip --
<6>loop: loaded (max 8 devices)
<6>SCSI subsystem driver Revision: 1.00
<6>sym.2.2.0: setting PCI_COMMAND_PARITY...
<6>sym.2.2.0: setting PCI_COMMAND_INVALIDATE.
<6>sym0: <895> rev 0x1 on pci bus 2 device 2 function 0 irq 4,7d4
<4>sym0: Tekram NVRAM, ID 7, Fast-40, LVD, parity checking
<5>sym0: SCSI BUS has been reset.
<6>scsi0 : sym-2.1.17a
<4>  Vendor: FUJITSU  Model: MAH3182MP  Rev: 0115
<4>  Type:   Direct-Access  ANSI SCSI revision: 04
<4>  Vendor: FUJITSU  Model: MAH3182MP  Rev: 0115
<4>  Type:   Direct-Access  ANSI SCSI revision: 04
<4>  Vendor: FUJITSU  Model: MAH3182MP  Rev: 0115
<4>  Type:   Direct-Access  ANSI SCSI revision: 04
<6>sym0:0:0: tagged command queuing enabled, command queue depth 32.
<6>sym0:1:0: tagged command queuing enabled, command queue depth 32.
<6>sym0:2:0: tagged command queuing enabled, command queue depth 32.
<4>Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
<4>Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0
<4>Attached scsi disk sdc at scsi0, channel 0, id 2, lun 0
<6>sym0:0: FAST-40 WIDE SCSI 80.0 MB/s ST (25.0 ns, offset 31)
<4>SCSI device sda: 35701260 512-byte hdwr sectors (18279 MB)
<6> sda: sda1 sda3 sda4
<6>sym0:1: FAST-40 WIDE SCSI 80.0 MB/s ST (25.0 ns, offset 31)
<4>SCSI device sdb: 35701260 512-byte hdwr sectors (18279 MB)
<6> sdb: sdb1 sdb3 sdb4
<6>sym0:2: FAST-40 WIDE SCSI 80.0 MB/s ST (25.0 ns, offset 31)
<4>SCSI device sdc: 35701260 512-byte hdwr sectors (18279 MB)
<6> sdc: sdc1 sdc3 sdc4
<4>OBP Flash: RD 1fff0000000[100000] WR 1fff0000000[100000]
<6>md: raid5 personality registered as nr 4
<6>raid5: measuring checksumming speed
<4> VIS : 142.400 MB/sec
<4>raid5: using function: VIS (142.400 MB/sec)
<6>md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
<6>md: Autodetecting RAID arrays.
<6>md: autorun ...
<6>md: ... autorun DONE.
<6>NET4: Linux TCP/IP 1.0 for NET4.0
<6>IP Protocols: ICMP, UDP, TCP
<4>IP: routing cache hash table of 4096 buckets, 64Kbytes
<4>TCP: Hash tables configured (established 32768 bind 32768)
<6>NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
<4>VFS: Mounted root (ext2 filesystem) readonly.
<4>sys32_ioctl(showconsole:22): Unknown cmd fd(0) cmd(40045432) arg(effffc04)
<6> [events: 0000001a]
<6> [events: 0000001a]
<6> [events: 0000001a]
<6>md: autorun ...
<6>md: considering sdc1 ...
<6>md: adding sdc1 ...
<6>md: adding sdb1 ...
<6>md: adding sda1 ...
<6>md: created md0
<6>md: bind<sda1,1>
<6>md: bind<sdb1,2>
<6>md: bind<sdc1,3>
<6>md: running: <sdc1><sdb1><sda1>
<6>md: sdc1's event counter: 0000001a
<6>md: sdb1's event counter: 0000001a
<6>md: sda1's event counter: 0000001a
<6>md0: max total readahead window set to 512k
<6>md0: 2 data-disks, max readahead per data-disk: 256k
<6>raid5: device sdc1 operational as raid disk 2
<6>raid5: device sdb1 operational as raid disk 1
<6>raid5: device sda1 operational as raid disk 0
<6>raid5: allocated 6566kB for md0
<4>raid5: raid level 5 set md0 active with 3 out of 3 devices, algorithm 2
<4>RAID5 conf printout:
<4> --- rd:3 wd:3 fd:0
<4> disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda1
<4> disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdb1
<4> disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc1
<4>RAID5 conf printout:
<4> --- rd:3 wd:3 fd:0
<4> disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda1
<4> disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdb1
<4> disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc1
<6>md: updating md0 RAID superblock on device
<6>md: sdc1 [events: 0000001b]<6>(write) sdc1's sb offset: 2047936
<6>md: sdb1 [events: 0000001b]<6>(write) sdb1's sb offset: 2047936
<6>md: sda1 [events: 0000001b]<6>(write) sda1's sb offset: 2047936
<6>md: ... autorun DONE.
<6> [events: 0000001a]
<6> [events: 0000001a]
<6> [events: 0000001a]
<6>md: autorun ...
<6>md: considering sdc4 ...
<6>md: adding sdc4 ...
<6>md: adding sdb4 ...
<6>md: adding sda4 ...
<6>md: created md1
<6>md: bind<sda4,1>
<6>md: bind<sdb4,2>
<6>md: bind<sdc4,3>
<6>md: running: <sdc4><sdb4><sda4>
<6>md: sdc4's event counter: 0000001a
<6>md: sdb4's event counter: 0000001a
<6>md: sda4's event counter: 0000001a
<6>md1: max total readahead window set to 512k
<6>md1: 2 data-disks, max readahead per data-disk: 256k
<6>raid5: device sdc4 operational as raid disk 2
<6>raid5: device sdb4 operational as raid disk 1
<6>raid5: device sda4 operational as raid disk 0
<6>raid5: allocated 6566kB for md1
<4>raid5: raid level 5 set md1 active with 3 out of 3 devices, algorithm 2
<4>RAID5 conf printout:
<4> --- rd:3 wd:3 fd:0
<4> disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda4
<4> disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdb4
<4> disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc4
<4>RAID5 conf printout:
<4> --- rd:3 wd:3 fd:0
<4> disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda4
<4> disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdb4
<4> disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc4
<6>md: updating md1 RAID superblock on device
<6>md: sdc4 [events: 0000001b]<6>(write) sdc4's sb offset: 15798208
<6>md: sdb4 [events: 0000001b]<6>(write) sdb4's sb offset: 15798208
<6>md: sda4 [events: 0000001b]<6>(write) sda4's sb offset: 15798208
<6>md: ... autorun DONE.
<6>Adding Swap: 524648k swap-space (priority 42)
<4>raid5: switching cache buffer size, 8192 --> 1024
<4>raid5: switching cache buffer size, 1024 --> 4096
<6>kjournald starting. Commit interval 5 seconds
<6>EXT3 FS 2.4-0.9.17, 10 Jan 2002 on md(9,0), internal journal
<6>EXT3-fs: mounted filesystem with ordered data mode.
<4>raid5: switching cache buffer size, 8192 --> 1024
<4>raid5: switching cache buffer size, 1024 --> 4096
<6>kjournald starting. Commit interval 5 seconds
<6>EXT3 FS 2.4-0.9.17, 10 Jan 2002 on md(9,1), internal journal
<6>EXT3-fs: mounted filesystem with ordered data mode.
-- snip snip --

See, it does detect the array the second time, but by then it is too late to boot with / on /dev/md0 instead of the IDE disk. This is the 2.4.19-pre4 kernel from kernel.org, by the way.

Oh, and one more thing: does anyone know why the kernel switches the raid5 cache buffer size like that? Can't I set it to a fixed value, and maybe make it a bit bigger (if that improves performance)?