Hello,

after upgrading from 3.8.7 to 3.9.x I noticed some slightly longer delays when resuming from suspend-to-disk and a few new error messages in the logs:

ata1: link is slow to respond, please be patient (ready=0)
ata3: link is slow to respond, please be patient (ready=0)
ata1: SRST failed (errno=-16)
ata3: SRST failed (errno=-16)
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ACPI cmd ef/03:46:00:00:00:a0 (SET FEATURES) filtered out
ata1.00: configured for UDMA/133
sd 0:0:0:0: [sda] Starting disk
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ACPI cmd ef/03:46:00:00:00:a0 (SET FEATURES) filtered out
ata3.00: configured for UDMA/133
sd 2:0:0:0: [sdc] Starting disk
PM: restore of devices complete after 11448.044 msecs

(compared to ~2900 msecs with 3.8.x).

I didn't mind the messages since things were working anyway, but lately in a couple of cases some disks were dropped and disabled, with the obvious consequences for the functionality of the system:

ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: limiting SATA link speed to 1.5 Gbps
ata2: SRST failed (errno=-16)
ata2: reset failed, giving up
ata2.00: disabled
sd 1:0:0:0: [sdb] Starting disk
sd 1:0:0:0: [sdb] START_STOP FAILED
sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00
dpm_run_callback(): scsi_bus_restore+0x0/0x20 returns 262144
PM: Device 1:0:0:0 failed to restore async: error 262144
PM: restore of devices complete after 61130.778 msecs
...
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00
sd 1:0:0:0: [sdb] CDB: cdb[0]=0x28: 28 00 00 00 00 6f 00 00 08 00
end_request: I/O error, dev sdb, sector 111
XFS (sdb1): metadata I/O error: block 0x30 ("xfs_trans_read_buf_map") error 5 numblks 8
ffff88006f710000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
XFS (sdb1): Internal error xfs_dir2_leaf_verify at line 62 of file fs/xfs/xfs_dir2_leaf.c. Caller 0xffffffff811d17d8

Not having connected the issue with the previous error messages I initially blamed the disk and replaced it with a new one thinking it was broken, but the problem persisted so I started a bisection run that pointed to this commit:

b8bb6cb999858043489c1ddef08eed2127559169 is the first bad commit
commit b8bb6cb999858043489c1ddef08eed2127559169
Author: Zhang Rui <rui.zhang@xxxxxxxxx>
Date: Thu Nov 22 15:45:02 2012 +0800

    step_wise: Unify the code for both throttle and dethrottle

    Signed-off-by: Zhang Rui <rui.zhang@xxxxxxxxx>

:040000 040000 8d313c2dd1dacca2dd01128cf48bb9804e95f258 62bf2c47f021832f423172db87f9e367a91460f2 M drivers

git bisect start
# bad: [c1be5a5b1b355d40e6cf79cc979eb66dafa24ad1] Linux 3.9
git bisect bad c1be5a5b1b355d40e6cf79cc979eb66dafa24ad1
# good: [19f949f52599ba7c3f67a5897ac6be14bfcb1200] Linux 3.8
git bisect good 19f949f52599ba7c3f67a5897ac6be14bfcb1200
# good: [d778df51c09264076fe0208c099ef7d428f21790] mm: vmscan: save work scanning (almost) empty LRU lists
git bisect good d778df51c09264076fe0208c099ef7d428f21790
# good: [ee89f81252179dcbf6cd65bd48299f5e52292d88] Merge branch 'for-3.9/core' of git://git.kernel.dk/linux-block
git bisect good ee89f81252179dcbf6cd65bd48299f5e52292d88
# bad: [fa4a6732a8e6153435941cb730d7d54c8367fe72] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
git bisect bad fa4a6732a8e6153435941cb730d7d54c8367fe72
# bad: [37cae6ad4c484030fa972241533c32730ec79b7d] Merge tag 'dm-3.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm
git bisect bad 37cae6ad4c484030fa972241533c32730ec79b7d
# bad: [1a32c58bb945970e56f27a1cfb61625a3ac0b88e] Merge tag 'late-mvebu-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect bad 1a32c58bb945970e56f27a1cfb61625a3ac0b88e
# good: [b6669737d3db7df79fad07180837c23dbe581db5] Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux
git bisect good b6669737d3db7df79fad07180837c23dbe581db5
# bad: [7307c00f335a4e986586b12334696098d2fc2bcd] Merge tag 'late-omap' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect bad 7307c00f335a4e986586b12334696098d2fc2bcd
# bad: [f8f466c81795a3ed2b8a74c8feebc280aec3db81] Merge tag 'late-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect bad f8f466c81795a3ed2b8a74c8feebc280aec3db81
# bad: [4f0a6847815837b63b05fc23878ba391701d8f6a] Thermal: exynos: Add support for temperature falling interrupt.
git bisect bad 4f0a6847815837b63b05fc23878ba391701d8f6a
# bad: [d6d71ee4a14ae602db343ec48c491851d7ec5267] PM: Introduce Intel PowerClamp Driver
git bisect bad d6d71ee4a14ae602db343ec48c491851d7ec5267
# bad: [c313637641c3e33388903e16cfde97b2e67adb9e] thermal: db8500: Use of_match_ptr() macro in db8500_thermal.c
git bisect bad c313637641c3e33388903e16cfde97b2e67adb9e
# bad: [bbf63be4f331358173da26b888a10583fcc92ec0] Thermal: exynos: Add sysfs node supporting exynos's emulation mode.
git bisect bad bbf63be4f331358173da26b888a10583fcc92ec0
# good: [3dbfff3dfe6714aeefb615c65bec0800dc5a4c51] Introduce THERMAL_TREND_RAISE/DROP_FULL support for step_wise governor
git bisect good 3dbfff3dfe6714aeefb615c65bec0800dc5a4c51
# bad: [b8bb6cb999858043489c1ddef08eed2127559169] step_wise: Unify the code for both throttle and dethrottle
git bisect bad b8bb6cb999858043489c1ddef08eed2127559169

Reverting it fixed the problem, as far as I can see. Any idea of what could be going wrong here?

Thank you,
Giacomo Perale
Attachment: dmesg.3.9.0