On Wed, Aug 29, 2012 at 09:06:28AM -0400, Joe Landman wrote:
> We've found modern LSI
> HBA and RAID gear have had issues with occasional "events" that seem
> to be more firmware bugs or driver bugs than anything else. The
> gear is stable for very light usage, but when pushed hard (without
> driver/fw updates), it does crash, hard, often with corruption.

That's what I was afraid of :-(

Last week I set about reproducing this problem again on some test boxes,
and most annoyingly, I have been unable to. The test ran for about 5 days
before one of the (Seagate) hard drives had an I/O error over the weekend,
and XFS shut down as you said it would. I've just moved the remaining
drives to another box, but after an hour it hasn't failed either.

These boxes are identical specs to the production boxes. The production
ones may get their filesystems wiped soon anyway, in which case I can try
reproducing on the actual same boxes.

> xfs is a parallel IO file system, ext4 is not. There is a very good
> chance you are tickling a bug lower in the stack. Which LSI HBA or
> RAID are you using?

HBAs, one 8 port and one 16 port.

root@dev-storage2:~# ./sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 12.00.00.00 (2011.11.08)
Copyright (c) 2008-2011 LSI Corporation. All rights reserved

Adapter Selected is a LSI SAS: SAS2116_1(B1)

Num  Ctlr            FW Ver       NVDATA       x86-BIOS     PCI Addr
----------------------------------------------------------------------------
0    SAS2116_1(B1)   12.00.00.00  0c.00.00.01  07.23.01.00  00:02:00:00
1    SAS2008(B2)     12.00.00.00  0c.00.00.05  07.23.01.00  00:03:00:00

Finished Processing Commands Successfully.
Exiting SAS2Flash.

> How have you set this up?

mdadm --create /dev/md/huge -n 24 -c 1024 -l raid0 /dev/sd{b..y}
mkfs.xfs -f -n size=16384 /dev/md/huge

> What kernel rev

ubuntu 12.04, stock kernel 3.2.0-26 (a bit behind on updates; 3.2.0-29 is
latest)

> and whats the
>
> modinfo mpt2sas
> lspci
> uname -a
>
> output?

root@dev-storage2:~# modinfo mpt2sas
filename:       /lib/modules/3.2.0-26-generic/kernel/drivers/scsi/mpt2sas/mpt2sas.ko
version:        10.100.00.00
license:        GPL
description:    LSI MPT Fusion SAS 2.0 Device Driver
author:         LSI Corporation <DL-MPTFusionLinux at lsi.com>
srcversion:     44529298D89618E1BA4A0EC
alias:          pci:v00001000d0000007Esv*sd*bc*sc*i*
alias:          pci:v00001000d0000006Esv*sd*bc*sc*i*
alias:          pci:v00001000d00000087sv*sd*bc*sc*i*
alias:          pci:v00001000d00000086sv*sd*bc*sc*i*
alias:          pci:v00001000d00000085sv*sd*bc*sc*i*
alias:          pci:v00001000d00000084sv*sd*bc*sc*i*
alias:          pci:v00001000d00000083sv*sd*bc*sc*i*
alias:          pci:v00001000d00000082sv*sd*bc*sc*i*
alias:          pci:v00001000d00000081sv*sd*bc*sc*i*
alias:          pci:v00001000d00000080sv*sd*bc*sc*i*
alias:          pci:v00001000d00000065sv*sd*bc*sc*i*
alias:          pci:v00001000d00000064sv*sd*bc*sc*i*
alias:          pci:v00001000d00000077sv*sd*bc*sc*i*
alias:          pci:v00001000d00000076sv*sd*bc*sc*i*
alias:          pci:v00001000d00000074sv*sd*bc*sc*i*
alias:          pci:v00001000d00000072sv*sd*bc*sc*i*
alias:          pci:v00001000d00000070sv*sd*bc*sc*i*
depends:        scsi_transport_sas,raid_class
intree:         Y
vermagic:       3.2.0-26-generic SMP mod_unload modversions
parm:           logging_level: bits for enabling additional logging info (default=0)
parm:           max_sectors:max sectors, range 64 to 8192 default=8192 (ushort)
parm:           max_lun: max lun, default=16895 (int)
parm:           max_queue_depth: max controller queue depth (int)
parm:           max_sgl_entries: max sg entries (int)
parm:           msix_disable: disable msix routed interrupts (default=0) (int)
parm:           missing_delay: device missing delay , io missing delay (array of int)
parm:           mpt2sas_fwfault_debug: enable detection of firmware fault and halt firmware - (default=0)
parm:           disable_discovery: disable discovery (int)
parm:           diag_buffer_enable: post diag buffers (TRACE=1/SNAPSHOT=2/EXTENDED=4/default=0) (int)

root@dev-storage2:~# lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:01.1 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:06.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5)
00:1c.1 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 2 (rev b5)
00:1c.2 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 3 (rev b5)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 4 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b5)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
00:1f.0 ISA bridge: Intel Corporation C204 Chipset Family LPC Controller (rev 05)
00:1f.2 IDE interface: Intel Corporation 6 Series/C200 Series Chipset Family 4 port SATA IDE Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 05)
00:1f.5 IDE interface: Intel Corporation 6 Series/C200 Series Chipset Family 2 port SATA IDE Controller (rev 05)
01:00.0 Ethernet controller: Intel Corporation 82599EB 10 Gigabit TN Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82599EB 10 Gigabit TN Network Connection (rev 01)
02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (rev 02)
03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
04:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 02)
05:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 10)
06:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
07:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
08:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

root@dev-storage2:~# uname -a
Linux dev-storage2.example.com 3.2.0-26-generic #41-Ubuntu SMP Thu Jun 14 17:49:24 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Anyway, many thanks for sharing your experience. This was definitely
reproducible before, I'll come back when I can reproduce it again :-(

Regards,

Brian.
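
P.S. In case it helps anyone reading the mdadm line above, here is a rough
sketch (Python, purely illustrative) of the geometry it implies: 24 member
devices with a 1024 KiB chunk. The helper names device_names and
raid0_geometry are made up for this note and are not part of any tool. The
su/sw string is only what one would pass to mkfs.xfs -d if setting the
stripe unit and width by hand; mkfs.xfs usually picks the geometry up from
md on its own, and I did not pass these options myself.

```python
import string


def device_names(first, last, prefix="/dev/sd"):
    """Expand a shell-style {first..last} letter range into device paths."""
    letters = string.ascii_lowercase
    start, end = letters.index(first), letters.index(last)
    return [prefix + c for c in letters[start:end + 1]]


def raid0_geometry(num_devices, chunk_kib):
    """Return the full-stripe size of a RAID0 array and matching su/sw values.

    In RAID0 every member holds data, so the stripe width (sw) is simply
    the number of devices and the stripe unit (su) is the md chunk size.
    """
    return {
        "stripe_unit_kib": chunk_kib,
        "stripe_width_kib": num_devices * chunk_kib,
        "mkfs_opts": f"su={chunk_kib}k,sw={num_devices}",
    }


# Values taken from the mdadm command in this mail: /dev/sd{b..y}, -c 1024
devices = device_names("b", "y")
geom = raid0_geometry(len(devices), 1024)
print(len(devices), geom["stripe_width_kib"], geom["mkfs_opts"])
```

So one full stripe across the array is 24 x 1024 KiB = 24 MiB, which is the
amount of sequential I/O needed to touch every drive once.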