For some time now, one of my systems has been "freezing" after limited
uptime (a few hours), usually while compiling packages. This seems to
happen only with recent kernel versions (2.6.27-rc*); I don't remember
whether it also happened with 2.6.26, though I'm pretty sure it did not
happen with the early 2.6.2x series.

Unfortunately this always shuts down the root filesystem, rendering the
system unusable.

The kernel output below was generated by 2.6.27-rc5-git9. The same
symptoms occurred with other -rc releases of 2.6.27, but I couldn't look
at dmesg then because the failure hits / and I only recently enabled
networked syslog on that box in order to find out what happens.

Unfortunately, either the chipset or the BIOS does not support AHCI for
the SATA controller: the only choice for SATA offered by the BIOS is IDE.

Is this a known issue? Judging by the Google results for the error
messages (the exception and the originating command), similar ATA
exceptions seem to have been reported lately.

-- improvement suggestion --
To keep the system running, it would be nice if the failing command could
be re-issued after resetting the link and rediscovering the drive, that
is, pushing the error to the upper layers only if the retry following the
reset fails as well.
-- end of suggestion --

If the kernel config or the complete dmesg output would be of any help,
please let me know. If there are tuning options to try in order to
pinpoint the cause, I can try them out; that system is not in production
use. (According to some of the messages I found, it could be related to
drive cache flushing.)

Bruno


lspci output:

00:00.0 Host bridge [0600]: VIA Technologies, Inc. CX700 Host Bridge [1106:0324] (rev 03)
00:00.1 Host bridge [0600]: VIA Technologies, Inc. CX700 Host Bridge [1106:1324]
00:00.2 Host bridge [0600]: VIA Technologies, Inc. CX700 Host Bridge [1106:2324]
00:00.3 Host bridge [0600]: VIA Technologies, Inc. CX700 Host Bridge [1106:3324]
00:00.4 Host bridge [0600]: VIA Technologies, Inc. CX700 Host Bridge [1106:4324]
00:00.7 Host bridge [0600]: VIA Technologies, Inc. CX700 Host Bridge [1106:7324]
00:01.0 PCI bridge [0604]: VIA Technologies, Inc. VT8237 PCI Bridge [1106:b198]
00:0f.0 IDE interface [0101]: VIA Technologies, Inc. Device [1106:0581]
00:10.0 USB Controller [0c03]: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller [1106:3038] (rev 90)
00:10.1 USB Controller [0c03]: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller [1106:3038] (rev 90)
00:10.2 USB Controller [0c03]: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller [1106:3038] (rev 90)
00:10.4 USB Controller [0c03]: VIA Technologies, Inc. USB 2.0 [1106:3104] (rev 90)
00:11.0 ISA bridge [0601]: VIA Technologies, Inc. CX700 PCI to ISA Bridge [1106:8324]
00:11.7 Host bridge [0600]: VIA Technologies, Inc. CX700 Internal Module Bus [1106:324e]
00:13.0 Host bridge [0600]: VIA Technologies, Inc. CX700 Host Bridge [1106:324b]
00:13.1 PCI bridge [0604]: VIA Technologies, Inc. CX700 PCI to PCI Bridge [1106:324a]
01:00.0 VGA compatible controller [0300]: VIA Technologies, Inc. CX700M2 UniChrome PRO II Graphics [1106:3157] (rev 03)
02:08.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet [10ec:8169] (rev 10)
80:01.0 Audio device [0403]: VIA Technologies, Inc. VIA High Definition Audio Controller [1106:3288] (rev 10)


Hard-drive details as reported by hdparm -I:

/dev/sda:

ATA device, with non-removable media
        Model Number:       FUJITSU MHY2250BH
        Serial Number:      K407T7A25THF
        Firmware Revision:  0000000B
        Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5; Revision: ATA8-AST T13 Project D1697 Revision 0b
Standards:
        Used: ATA-8-ACS revision 3f
        Supported: 8 7 6 5
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  268435455
        LBA48  user addressable sectors:  488397168
        device size with M = 1024*1024:      238475 MBytes
        device size with M = 1000*1000:      250059 MBytes (250 GB)
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        Advanced power management level: 128
        Recommended acoustic management value: 254, current value: 254
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
                Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    DOWNLOAD_MICROCODE
           *    Advanced Power Management feature set
                SET_MAX security extension
           *    Automatic Acoustic Management feature set
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    WRITE_{DMA|MULTIPLE}_FUA_EXT
           *    64-bit World wide name
           *    IDLE_IMMEDIATE with UNLOAD
                Disable Data Transfer After Error Detection
           *    WRITE_UNCORRECTABLE_EXT command
           *    {READ,WRITE}_DMA_EXT_GPL commands
           *    Segmented DOWNLOAD_MICROCODE
           *    SATA-I signaling speed (1.5Gb/s)
           *    Native Command Queueing (NCQ)
           *    Host-initiated interface power management
           *    Phy event counters
                DMA Setup Auto-Activate optimization
                Device-initiated interface power management
           *    Software settings preservation
           *    SMART Command Transport (SCT) feature set
           *    SCT LBA Segment Access (AC2)
           *    SCT Error Recovery Control (AC3)
           *    SCT Features Control (AC4)
           *    SCT Data Tables (AC5)
Security:
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
        not     frozen
        not     expired: security count
        not     supported: enhanced erase
        250min for SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5000e040f1a7bd
        NAA             : 5
        IEEE OUI        : e
        Unique ID       : 040f1a7bd
Checksum: correct


Kernel messages related to driver initialization:

[    2.568109] pata_via 0000:00:0f.0: version 0.3.3
[    2.568313] scsi0 : pata_via
[    2.568748] scsi1 : pata_via
[    2.573314] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xff00 irq 14
[    2.573418] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xff08 irq 15
[    2.760280] ata1.00: ATA-8: FUJITSU MHY2250BH, 0000000B, max UDMA/100
[    2.760422] ata1.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 0/32)
[    2.800304] ata1.00: configured for UDMA/100
[    2.971844] scsi 0:0:0:0: Direct-Access ATA FUJITSU MHY2250B 0000 PQ: 0 ANSI: 5
[    2.972976] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
[    2.973192] sd 0:0:0:0: [sda] Write Protect is off
[    2.973321] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    2.973453] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    2.973938] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
[    2.974142] sd 0:0:0:0: [sda] Write Protect is off
[    2.974270] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    2.974399] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    2.974588] sda: sda1 sda2 sda3 sda4 sda5 sda6
[    3.201488] sd 0:0:0:0: [sda] Attached SCSI disk


Kernel error output related to XFS shutdown:

[ 9352.420180] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 9352.420247] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[ 9352.420261] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 9352.420289] ata1.00: status: { DRDY }
[ 9352.420353] ata1: soft resetting link
[ 9352.650317] ata1.00: configured for UDMA/100
[ 9352.650374] end_request: I/O error, dev sda, sector 6410215
[ 9352.650432] ata1: EH complete
[ 9352.650654] I/O error in filesystem ("sda3") meta-data dev sda3 block 0x203fa6 ("xlog_iodone") error 5 buf count 32768
[ 9352.650824] xfs_force_shutdown(sda3,0x2) called from line 1027 of file /usr/src/linux-2.6.27-rc5-git9/fs/xfs/xfs_log.c. Return address = 0xc020ccba
[ 9352.651304] Filesystem "sda3": Log I/O Error Detected. Shutting down filesystem: sda3
[ 9352.651332] Please umount the filesystem, and rectify the problem(s)
[ 9352.651395] xfs_force_shutdown(sda3,0x2) called from line 790 of file /usr/src/linux-2.6.27-rc5-git9/fs/xfs/xfs_log.c. Return address = 0xc020dfce
[ 9352.654454] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
[ 9352.659345] xfs_force_shutdown(sda3,0x2) called from line 790 of file /usr/src/linux-2.6.27-rc5-git9/fs/xfs/xfs_log.c. Return address = 0xc020dfce
[ 9352.988239] sd 0:0:0:0: [sda] Write Protect is off
[ 9352.988277] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 9353.026123] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 9383.090107] Filesystem "sda3": xfs_log_force: error 5 returned.
[ 9413.090091] Filesystem "sda3": xfs_log_force: error 5 returned.
[ 9443.090112] Filesystem "sda3": xfs_log_force: error 5 returned.
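
For anyone reading the "cmd"/"res" pair in the kernel error output above, here is a rough Python sketch that decodes the two lines. It is not part of libata or any existing tool: the function name decode_taskfile, the opcode table (a small subset of ATA opcodes), and the status-bit table are illustrative choices, and the field order (command or status byte first, then the feature or error byte) is assumed from the way libata prints taskfiles.

```python
import re

# Small, deliberately incomplete subset of ATA command opcodes.
ATA_OPCODES = {
    0x25: "READ DMA EXT",
    0x35: "WRITE DMA EXT",
    0x60: "READ FPDMA QUEUED",
    0x61: "WRITE FPDMA QUEUED",
    0xC8: "READ DMA",
    0xCA: "WRITE DMA",
    0xE7: "FLUSH CACHE",
    0xEA: "FLUSH CACHE EXT",
}

# ATA status register bits, highest bit first.
STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"),
               (0x08, "DRQ"), (0x01, "ERR")]

# Five colon-separated hex bytes, e.g. "00:01:00:4f:c2".
_FIVE = r"(?:[0-9a-f]{2}:){4}[0-9a-f]{2}"
TASKFILE_RE = re.compile(
    r"\b(cmd|res)\s+([0-9a-f]{2})/(" + _FIVE + r")/(" + _FIVE + r")/([0-9a-f]{2})"
)


def decode_taskfile(line):
    """Decode one libata 'cmd' or 'res' log line; return None if it is neither."""
    match = TASKFILE_RE.search(line)
    if match is None:
        return None
    kind, first, low, _high, _device = match.groups()
    first_byte = int(first, 16)                # command (cmd) or status (res)
    second_byte = int(low.split(":")[0], 16)   # feature (cmd) or error (res)
    if kind == "cmd":
        return {
            "kind": "cmd",
            "opcode": first_byte,
            "name": ATA_OPCODES.get(first_byte, "unknown"),
            "feature": second_byte,
        }
    flags = [name for bit, name in STATUS_BITS if first_byte & bit]
    return {"kind": "res", "status": first_byte, "flags": flags,
            "error": second_byte}


if __name__ == "__main__":
    log = [
        "[ 9352.420247] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0",
        "[ 9352.420261] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)",
    ]
    for entry in log:
        print(decode_taskfile(entry))
```

Run against the two lines from the log, the command byte 0xea decodes to FLUSH CACHE EXT and the status byte 0x40 to DRDY with an error register of 0. In other words, the command that timed out was a cache flush, which fits the suspicion about drive cache flushing.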