Re: Scary Intel SATA problem: "frozen"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Tejun Heo wrote:
> Jonas Lundgren wrote:
> [--snip--]
>> Also, it doesn't matter if I enable AHCI in the BIOS (But with AHCI
>> enabled the disks spin down/power down when I boot, just to power up
>> again a few seconds after. The boot progress freezes until the disks
>> have spun up again. (This happens when the kernel probes the sata
>> controller ports at bootup, the disks spin down at the same time, but
>> spin up one by one as they're getting probed))
> 
> Likely fix is pending for this problem.
> 
>> I've tried changing I/O scheduler, only noticable diffrence is when I
>> use "noop". Then I get like 20mb/sec write instead of 4mb/sec. I have no
>> idea why this is :P
>>
>> Example of what I mean with crappy performance:
>> dd if=/dev/zero of=test232 bs=1M count=100; time sync
>> 100+0 records in
>> 100+0 records out
>> 104857600 bytes (105 MB) copied, 0.130424 s, 804 MB/s
>> real 0m21.104s
>> user 0m0.000s
>> sys 0m0.011s
>>
>> 21 seconds to do a seq write of 100mb.. And during this time ALL other
>> disk IO gets starved, I can't do anything that uses disk IO for the
>> duration.. (not even `ls`)
> 
> What does the kernel say during this writing?  Can you post the result
> of the following?
> 
> 1. reboot
> 2. dmesg -c
> 3. time dd if=/dev/zero.. blah
> 4. dmesg
> 
> Also, does 'mount -o remount,barrier=0 /' change anything?

I will post this info as soon as I can "reproduce" the error.

> 
>> Yet, a hdparm shows a decent read
>> hdparm -tT /dev/md4
>> /dev/md4:
>> Timing cached reads: 8060 MB in 1.99 seconds = 4042.19 MB/sec
>> Timing buffered disk reads: 400 MB in 3.00 seconds = 133.28 MB/sec
>>
>> dd if=1GBzeroFile of=/dev/null bs=1M count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 11.4335 s, 91.7 MB/s
>>
>> This is the cpu usage stats I get from top when running the dd write:
>> Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 99.0%wa, 0.5%hi, 0.5%si, 0.0%st
>> Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>
>> Pretty crappy read speeds compared to what I got on my previous mobo
>> (around 140mb/sec), but still alot better than the 4mb/sec I get when
>> writing..
> 
> Which controller did you use on your previous mobo?  If you're using
> ata_piix and hook two hard drives as primary and secondary on the same
> channel, some level of performance degradation is expected.  ata_piix
> can only issue command to only one of the two drives at once.  Is the
> read performance still bad in ahci mode?

Atm I run the ICH8 SATA ports in AHCI mode with "IDE bus master"(To be
honest I don't really know what this option does, no info about it in
the BIOS nor the mobo manual) turned off in BIOS. The drives are
connected to port 1, 3, 6 and 8 (raptor+raptor on 1+3, and WD 250G + WD
250G (also a raid0) on ports 6+8)

> 
> [--snip--]
>> Dmesg output from the error(s): (sda and sdb are 2 * 74GB raptor SATA
>> drives in a Linux software raid0)
>>
>> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata1.00: (BMDMA stat 0x20)
>> ata1.00: tag 0 cmd 0xca Emask 0x4 stat 0x40 err 0x0 (timeout)
> 
> This might be a missed interrupt.  It's a write.  DMA engine is done
> finishing transferring all data.  Device is ready for the next command
> but the interrupt has never arrived.
> 
>> ata1: port is slow to respond, please be patient
>> ata1: port failed to respond (30 secs)
>> ata1: soft resetting port
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ata1.00: qc timeout (cmd 0xec)
>> ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>> ata1.00: revalidation failed (errno=-5)
>> ata1: failed to recover some devices, retrying in 5 secs
> 
> But this is weird.  If it were a missed interrupt, softreset should have
> recovered it instantly.  Something fishy is going on.
> 
> [--snip--]
>> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata1.00: (BMDMA stat 0x21)
>> ata1.00: tag 0 cmd 0xc8 Emask 0x4 stat 0x40 err 0x0 (timeout)
> 
> Same thing for read.
> 
>> ata1: port is slow to respond, please be patient
>> ata1: port failed to respond (30 secs)
> 
> Again, pre-reset wait times out.  Weird.
> 
>> ata1: soft resetting port
>> ata1.00: configured for UDMA/100
>> ata1: EH complete
> [--snip--]
>> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata1.00: (BMDMA stat 0x21)
>> ata1.00: tag 0 cmd 0xc8 Emask 0x4 stat 0x40 err 0x0 (timeout)
> 
> Again, for read.
> 
>> Most of the time when I get these errors the system will recover after
>> anything from 10 seconds to 10 minutes of unresponsiveness (no disk
>> I/O), and sometimes hang.
> 
> Yeap, libata needs stricter timing constraints for recovery.  That's
> high on to-do list.
> 
>> IF the system does recover, I start getting
>> the extremly low disk write speeds that I reported above, and only a
>> reboot will get the performance back to regular.
> 
> Please full dmesg after your computer got really slow.  I suspect libata
> decided to switch to PIO mode.

If that's so, how come I still get good read speeds? o.O

> 
>> I don't know what causes it, but most of the times when I've gotten it
>> my system has been under heavy load (compiling, downloading torrents in
>> 11mb/sec etc). Please let me know if you want any additional info, want
>> me to try something out, or whatever. My recent hardware upgrade for
>> around $1200 (to a core2duo system, i965 mobo) is just going to waste
>> because of this problem. :/
> 
> Heh, nice machine you got there.  When you look at the dmesg, do the
> error messages occur only on one of the two drives?  Or are both
> affected?  If only one is affected,

IIRC only sda is affected, and later today I'm gonna switch back to
non-AHCI mode and try to reproduce this error (This might be my
imagination, but it feels like I get the error more frequently if I
don't run the ports in AHCI mode..) so I can try out the things you've
listed here.. Would suck if there's a hardware problem with one of my
disks, but I guess it's possible.

> 
> 1. swap the two.  you'll probably have to dance a little bit with boot
> loader but md should handle that fine once the kernel is loaded.  does
> the errors persist?  on which device do they occur?  do they follow the
> drive or stay on the mobo port?

(I'm running my /boot on a raid1, so switching drives should require no
reconfiguration at all :)

> 
> 2. try different cable / port.  if you change port, again, you need to
> dance w/ boot loader.  who's carrying the error messages with it?
> 
> 3. try different power plug from different power lane.
> 
>> I just got so glad when I saw the post of this on linux-ide, I've been
>> searching like crazy to find another person having the same problem (and
>> possibly a solution) for the past 2-3 weeks or so.
> 
> My first guess is frequent transmission errors.  Please report the test
> results.  Thanks.
>

I've pushing my system really hard for half an hour or so to reproduce
this problem, and I got something else (no write speed slowdown, but
some page allocation errors, no idea if this has something to do with
anything, but I'll post it anyways)

My dmesg is filled up with this:

swapper: page allocation failure. order:0, mode:0x20
 [<c0143a29>] __alloc_pages+0x2ec/0x308
 [<c03507a7>] tcp_v4_rcv+0x984/0x99d
 [<c0159ef2>] cache_alloc_refill+0x2d0/0x507
 [<c015a1a5>] __kmalloc+0x7c/0x7e
 [<c031c901>] __alloc_skb+0x58/0x108
 [<c026e369>] rtl8169_rx_fill+0x7b/0x140
 [<c026e6ea>] rtl8169_rx_interrupt+0x2bc/0x413
 [<c027069e>] rtl8169_poll+0x3f/0x20a
 [<c0132fa7>] hrtimer_run_queues+0xc0/0x14b
 [<c03226ef>] net_rx_action+0x79/0x106
 [<c01221e4>] __do_softirq+0x73/0xd9
 [<c0122281>] do_softirq+0x37/0x39
 [<c01056b5>] do_IRQ+0x40/0x7b
 [<c01037df>] common_interrupt+0x23/0x28
 [<c01010fe>] mwait_idle_with_hints+0x39/0x3b
 [<c0101118>] mwait_idle+0x18/0x1c
 [<c010123b>] cpu_idle+0x64/0x79
 =======================
Mem-info:
DMA per-cpu:
CPU    0: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1
usd:   0
CPU    1: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1
usd:   0
Normal per-cpu:
CPU    0: Hot: hi:  186, btch:  31 usd:  30   Cold: hi:   62, btch:  15
usd:  60
CPU    1: Hot: hi:  186, btch:  31 usd:  30   Cold: hi:   62, btch:  15
usd:  13
HighMem per-cpu:
CPU    0: Hot: hi:  186, btch:  31 usd: 147   Cold: hi:   62, btch:  15
usd:  49
CPU    1: Hot: hi:  186, btch:  31 usd: 174   Cold: hi:   62, btch:  15
usd:  14
Active:149205 inactive:322562 dirty:155279 writeback:8923 unstable:0
free:3019 slab:36075 mapped:52151 pagetables:808
DMA free:3520kB min:68kB low:84kB high:100kB active:0kB inactive:6684kB
present:16256kB pages_scanned:64 all_unreclaimable? no
lowmem_reserve[]: 0 873 2015
Normal free:1404kB min:3744kB low:4680kB high:5616kB active:20436kB
inactive:714124kB present:894080kB pages_scanned:45728 all_unreclaimable? no
lowmem_reserve[]: 0 0 9135
HighMem free:7152kB min:512kB low:1736kB high:2960kB active:576384kB
inactive:569440kB present:1169292kB pages_scanned:7 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB
1*2048kB 0*4096kB = 3520kB
Normal: 1*4kB 1*8kB 7*16kB 0*32kB 0*64kB 0*128kB 1*256kB 0*512kB
1*1024kB 0*2048kB 0*4096kB = 1404kB
HighMem: 1140*4kB 96*8kB 80*16kB 3*32kB 1*64kB 1*128kB 1*256kB 0*512kB
0*1024kB 0*2048kB 0*4096kB = 7152kB
Swap cache: add 29537, delete 19491, find 7603/9179, race 1+0
Free swap  = 1704516kB
Total swap = 1767128kB
Free swap:       1704516kB
524000 pages of RAM
294624 pages of HIGHMEM
5482 reserved pages
486945 pages shared
10046 pages swap cached
155279 pages dirty
8798 pages writeback
52151 pages mapped
36075 pages slab
808 pages pagetables

printk: 892 messages suppressed.
ent:md4!: page allocation failure. order:0, mode:0x20
 [<c0143a29>] __alloc_pages+0x2ec/0x308
 [<c03507a7>] tcp_v4_rcv+0x984/0x99d
 [<c0159ef2>] cache_alloc_refill+0x2d0/0x507
 [<c015a1a5>] __kmalloc+0x7c/0x7e
 [<c031c901>] __alloc_skb+0x58/0x108
 [<c026e369>] rtl8169_rx_fill+0x7b/0x140
 [<c026e6ea>] rtl8169_rx_interrupt+0x2bc/0x413
 [<c027069e>] rtl8169_poll+0x3f/0x20a
 [<c01168f9>] find_busiest_group+0x124/0x4fd
 [<c0270ed7>] rtl8169_interrupt+0xce/0x22b
 [<c03226ef>] net_rx_action+0x79/0x106
 [<c01221e4>] __do_softirq+0x73/0xd9
 [<c0122281>] do_softirq+0x37/0x39
 [<c01056b5>] do_IRQ+0x40/0x7b
 [<c036fc19>] __sched_text_start+0x309/0x937
 [<c01037df>] common_interrupt+0x23/0x28
 [<c02256ed>] _atomic_dec_and_lock+0x1d/0x80
 [<c0179044>] generic_sync_sb_inodes+0xe2/0x249
 [<c016f8f9>] iput+0x41/0x77
 [<c01b4a8f>] put_wbq+0x13/0x21
 [<c01b4c27>] entd+0x18a/0x2b8
 [<c01b4a9d>] entd+0x0/0x2b8
 [<c0130345>] kthread+0xcc/0xf3
 [<c0130279>] kthread+0x0/0xf3
 [<c01039eb>] kernel_thread_helper+0x7/0x1c
 =======================
Mem-info:
DMA per-cpu:
CPU    0: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1
usd:   0
CPU    1: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1
usd:   0
Normal per-cpu:
CPU    0: Hot: hi:  186, btch:  31 usd:  31   Cold: hi:   62, btch:  15
usd:  53
CPU    1: Hot: hi:  186, btch:  31 usd:  30   Cold: hi:   62, btch:  15
usd:  51
HighMem per-cpu:
CPU    0: Hot: hi:  186, btch:  31 usd: 131   Cold: hi:   62, btch:  15
usd:  54
CPU    1: Hot: hi:  186, btch:  31 usd:  77   Cold: hi:   62, btch:  15
usd:  15
Active:188954 inactive:272272 dirty:147727 writeback:8547 unstable:0
free:1969 slab:47710 mapped:50495 pagetables:832
DMA free:3520kB min:68kB low:84kB high:100kB active:0kB inactive:5256kB
present:16256kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 873 2015
Normal free:1284kB min:3744kB low:4680kB high:5616kB active:23140kB
inactive:666264kB present:894080kB pages_scanned:19930 all_unreclaimable? no
lowmem_reserve[]: 0 0 9135
HighMem free:3072kB min:512kB low:1736kB high:2960kB active:732676kB
inactive:417568kB present:1169292kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB
1*2048kB 0*4096kB = 3520kB
Normal: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB 0*512kB
1*1024kB 0*2048kB 0*4096kB = 1284kB
HighMem: 172*4kB 66*8kB 18*16kB 35*32kB 1*64kB 1*128kB 1*256kB 0*512kB
0*1024kB 0*2048kB 0*4096kB = 3072kB
Swap cache: add 29623, delete 19495, find 7636/9238, race 1+0
Free swap  = 1704512kB
Total swap = 1767128kB
Free swap:       1704512kB
524000 pages of RAM
294624 pages of HIGHMEM
5482 reserved pages
474337 pages shared
10128 pages swap cached
147727 pages dirty
8447 pages writeback
50495 pages mapped
47710 pages slab
832 pages pagetables


This happened after I kicked up a "dd if=/dev/mdX of=/dev/null" and "dd
if=/dev/zero of=tfile1" on both my raid0's (2 dd's per raid (1 read+1
write) per drive), started a recompile of kdelibs, and had a torrent
download running for 11mb/sec, ie pretty heavy load. (done with the
anticipatory scheduler)

I guess it could just be that the kernel couldn't read the swap in a
reasonable amount of time because of all the other disk activity, but
I'm posting it anyways, since you guys will know if it has something to
do with this or not.


ps. The system is back to stable now after those allocation errors.
-- 
-Jonas

Name:   Jonas Lundgren
ICQ#:   52064961
Mail:   jonas@xxxxxxxx
IRC:    neon / neonman @ EFnet, Undernet, Quakenet, freenode
-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux