Re: exception Emask 0x0 SAct 0x1 / SErr 0x0 action 0x2 frozen

From Brian's earlier e-mail:

> I filed this kernel bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=462425


On Mon, 22 Sep 2008, Justin Piszcz wrote:

I could not agree more.

CC'ing the relevant mailing lists to see if anyone out there has an idea of what more we could do. This has been affecting you more than me, but it still happens to me too, and I would like to get some sort of resolution as well.

Similar, but not the same issue:

Sep 17 20:20:05 p34 kernel: [1422169.440538] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Sep 17 20:20:05 p34 kernel: [1422169.440549] ata5.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0
Sep 17 20:20:05 p34 kernel: [1422169.440551] res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 17 20:20:05 p34 kernel: [1422169.440556] ata5.00: status: { DRDY }
Sep 17 20:20:05 p34 kernel: [1422169.440561] ata5: hard resetting link
Sep 17 20:20:06 p34 kernel: [1422169.744980] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 17 20:20:06 p34 kernel: [1422169.770448] ata5.00: configured for UDMA/133
Sep 17 20:20:06 p34 kernel: [1422169.770461] ata5: EH complete

(The log above is from kernel 2.6.23.3.)
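
For comparing reports like the two in this thread, it can help to pull the register values out of the "exception" line rather than eyeball them. The following is only a rough sketch: the regular expression is inferred from the log lines quoted in this thread, not from the libata source, and the names EXCEPTION_RE and parse_exception are made up for illustration. It extracts the values without trying to decode what each bit means.

```python
import re

# Pattern inferred from the "exception" lines quoted in this thread
# (an assumption, not taken from the kernel source).
EXCEPTION_RE = re.compile(
    r"(?P<port>ata\d+(?:\.\d+)?): exception "
    r"Emask (?P<emask>0x[0-9a-fA-F]+) "
    r"SAct (?P<sact>0x[0-9a-fA-F]+) "
    r"SErr (?P<serr>0x[0-9a-fA-F]+) "
    r"action (?P<action>0x[0-9a-fA-F]+)"
    r"(?P<frozen> frozen)?"
)


def parse_exception(line):
    """Return the fields of an 'exception' log line as a dict, or None."""
    m = EXCEPTION_RE.search(line)
    if m is None:
        return None
    return {
        "port": m.group("port"),
        "emask": int(m.group("emask"), 16),
        "sact": int(m.group("sact"), 16),
        "serr": int(m.group("serr"), 16),
        "action": int(m.group("action"), 16),
        "frozen": m.group("frozen") is not None,
    }
```

Run against the p34 line above and one of the radfiles lines, this makes the difference between the two reports explicit: SAct 0x0 with action 0x6 on one machine, SAct 0x1 with action 0x2 on the other.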

On Mon, 22 Sep 2008, Brian Rademacher wrote:

Works fine... It also works under heavy load with only 4 drives. With 4 drives I could only get it to fail by doing a RAID resync, except on the newer kernel, which dies pretty easily.

What is really frustrating about it is that, short of the bugzilla bug I submitted, I don't know who would be willing to listen... A lot of the Google hits when searching for "action 0x2 frozen" are related to a particular CDROM drive, or to general hardware failure. I really don't think that is the case here, but I bet most of the kernel people assume it is a hardware problem, so they have no reason to care...


Sent: Monday, September 22, 2008 7:04 AM
Subject: Re: Hardware RAID


What about if you just 'stress' one drive?

1. dd if=/dev/sda of=/dev/null bs=1M &
Does it do it?
2. Same thing for sdb?

Justin.
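
If it is more convenient to script the single-drive test than to run dd by hand, a sequential read along the same lines could look roughly like this. This is only a sketch: read_sequentially is a made-up name, the 1 MiB block size mirrors the bs=1M above, and reading a device such as /dev/sda needs root.

```python
import time


def read_sequentially(path, block_size=1024 * 1024, max_bytes=None):
    """Read `path` front to back in fixed-size blocks, discarding the data.

    Stops at end of file, or once at least `max_bytes` have been read
    when a limit is given. Returns (bytes_read, elapsed_seconds).
    """
    total = 0
    start = time.monotonic()
    # buffering=0 keeps Python from adding its own read buffer on top.
    with open(path, "rb", buffering=0) as dev:
        while max_bytes is None or total < max_bytes:
            chunk = dev.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.monotonic() - start
    return total, elapsed


# Example (as root), one drive at a time:
#   total, secs = read_sequentially("/dev/sda")
#   print(total / secs / 1e6, "MB/s")
```

Running it against one drive at a time while watching the kernel log would show whether a single busy drive is enough to trigger the exception.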

On Mon, 22 Sep 2008, Brian Rademacher wrote:

I killed smartd for testing. Other than that, it seems entirely load-based. Anything disk-intensive (backups, a RAID resync, a bunch of spam coming in at once, etc.) makes it fail...

Sent: Monday, September 22, 2008 6:29 AM
Subject: Re: Hardware RAID


The error happens for me as well, but NOT with that much consistency. If I were you, I would start testing different kernels and run in single-user mode (or as close to it as you can get) to see if you can narrow down what is causing it. Also, boot Knoppix and see if it occurs there?

Justin.

On Mon, 22 Sep 2008, Brian Rademacher wrote:

Doesn't look like a very powerful RAID card, so I may pass on it. I don't think it will have the bandwidth to run as fast as the software RAID currently does, since it's only a 64-bit/66 MHz PCI slot...

I hate to do the hardware RAID thing, but this error is killing me:
Sep 21 12:05:19 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Sep 21 12:32:12 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Sep 21 12:41:34 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Sep 21 12:58:22 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Sep 21 13:11:04 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Sep 21 13:23:55 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Sep 21 13:54:23 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Sep 21 15:15:04 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Sep 21 15:44:06 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Sep 21 21:15:12 radfiles kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen

And at this point, I can either regress to a 4-drive RAID and not update the kernel, or move forward with hardware RAID...

I don't see a fix coming any time soon, but maybe I'll try one of the latest F10 kernels just to see if anything has changed...
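
Since the failures look load-related, it may be useful to see how they are spaced in time when comparing kernels or drive counts. A minimal sketch, assuming classic syslog timestamps like the ones in the radfiles log above ("Sep 21 12:05:19", no year) and an English/C locale for the month names; exception_times and gaps_in_minutes are hypothetical helper names.

```python
from datetime import datetime


def exception_times(log_lines, marker="exception Emask"):
    """Collect the timestamps of log lines containing `marker`."""
    times = []
    for line in log_lines:
        if marker not in line:
            continue
        # The first three whitespace-separated fields are month, day, time.
        stamp = " ".join(line.split()[:3])
        # Syslog omits the year, so strptime falls back to 1900. That is
        # fine for differences within one log, not across a year boundary.
        times.append(datetime.strptime(stamp, "%b %d %H:%M:%S"))
    return times


def gaps_in_minutes(times):
    """Minutes between each consecutive pair of timestamps."""
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]
```

Feeding it the output of something like `grep "action 0x2 frozen" /var/log/messages` (the log path is an assumption) gives one gap per pair of consecutive events, which can then be compared between a 4-drive run and a run with more drives.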


----- Original Message -----
From: "Justin Piszcz"
Sent: Monday, September 22, 2008 2:05 AM
Subject: Re: Hardware RAID




On Sun, 21 Sep 2008, Brian Rademacher wrote:

The RAID gods must have been thinking about me. My MB has one of these funny slots and supports ZCR, so for the price I'm going to jump ship. I would guess (and hope) this solves the problem, especially since I'll have to reconstruct the entire array...

http://cgi.ebay.com/2113600-R-Adaptec-Serial-ATA-RAID-2025SA-Storage_W0QQitemZ250295938636QQihZ015QQcategoryZ167QQssPageNameZWDVWQQrdZ1QQcmdZViewItem

Hm cool-- let me know how it goes.

