[Bug 13982] New: [libata] (?) causing Hardlock in 2.6.30.4 during simultaneous read & write

http://bugzilla.kernel.org/show_bug.cgi?id=13982

           Summary: [libata] (?) causing Hardlock in 2.6.30.4 during
                    simultaneous read & write
           Product: IO/Storage
           Version: 2.5
    Kernel Version: 2.6.30.4
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: SCSI
        AssignedTo: linux-scsi@xxxxxxxxxxxxxxx
        ReportedBy: wylda@xxxxxxxx
        Regression: No


Created an attachment (id=22713)
 --> (http://bugzilla.kernel.org/attachment.cgi?id=22713)
kernel config

Hi.

HW: Server Board Intel STL2, 2x P3 @ 1GHz, 1GB ECC RAM

SW: self-compiled kernel 2.6.30.4 on Debian Lenny

Symptom: the PC stops responding completely (no reply to ping, no console
switching with ALT+F2..., no Numlock reaction, no CTRL-ALT-DEL, no ALT-SysRq)

Traces: No Oops, nothing in syslog etc.


I don't think it's a HW failure, because it never happened when running

 * 2x dd if=/dev/zero bs=1M count=200000 | md5sum -b

 * 2x dd if=/dev/zero of=test-x bs=1M count=200000

These tests take a long time on this HW (51 min and 85 min) and the checksums
were always OK. Tested many times.



Anyway, I'm usually able to trigger the hardlock within 2 min. I use this script:

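#!/bin/bash

# Two CPU/pipe loads: stream zeros through md5sum
dd if=/dev/zero bs=1M count=200000 | md5sum -b &
dd if=/dev/zero bs=1M count=200000 | md5sum -b &
# Two disk read loads: verify checksums of files in two directories
cd /home/pik/a || exit 1
md5sum -c office.md5 &
cd /home/pik/b || exit 1
md5sum -c office.md5 &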

So I run this stress script _and_ start an FTP write to the same HDD. That is
usually enough to hardlock the machine, but if it does not hardlock within 60 sec
I can help it along with another dd (dd if=/dev/zero of=test1 bs=1M count=200000).
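
The sequence above (background load, a 60 sec wait, then an extra dd writer) can be sketched as a parameterized script. This is only a rough sketch: the helper names read_load, write_load and repro are made up for illustration, and the FTP write and the md5sum -c runs over local files are left out because they depend on this particular machine. GNU dd (bs=1M) is assumed.

```shell
#!/bin/bash
# Sketch of the reproduction steps with sizes as parameters.
# Helper names are illustrative and do not come from the report.

# Pipe load: stream count_mb MiB of zeros through md5sum (no disk I/O).
read_load() {
    local count_mb=$1
    dd if=/dev/zero bs=1M count="$count_mb" 2>/dev/null | md5sum -b
}

# Disk write load: write count_mb MiB of zeros to the target file.
write_load() {
    local target=$1 count_mb=$2
    dd if=/dev/zero of="$target" bs=1M count="$count_mb" 2>/dev/null
}

# Start two pipe loads, wait delay_s seconds, then add the extra writer
# and wait for the background jobs to finish.
repro() {
    local target=$1 count_mb=$2 delay_s=$3
    read_load "$count_mb" >/dev/null &
    read_load "$count_mb" >/dev/null &
    sleep "$delay_s"
    write_load "$target" "$count_mb"
    wait
}

# With the numbers from the report (uncomment to run; writes about 200 GB):
# repro test1 200000 60
```

Running it with a small count first (for example repro /tmp/test1 10 1) is a cheap way to check that the script itself works before starting the long run.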


Further reasons why this should not be a HW failure: EDAC reports no complaints,
and it happens on different HW:

 * PATA drive IC35L040AVVA07 on ServerWorks OSB4 (MOBO's chipset aka IB6566
South Bridge)

 * SATA drives 2xWD5000AADS in md0 on Sil3114

 * Network card: PCI-X, Intel 1Gbps 82543GC

 * Network card: PCI Realtek RT8139


Today, while running the last test for this bug report, there was a trace, but the
hardlock was not 100% the same: as always, ping stopped working, console switching
did not work and Numlock did not react, but this time Alt-SysRq worked. I hope it's
not misleading; see the attachment.
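
Since Alt-SysRq responded in this one case, it may be worth confirming that the magic SysRq key is enabled before each reproduction attempt. A minimal sketch, assuming the standard /proc/sys/kernel/sysrq interface is present (it needs CONFIG_MAGIC_SYSRQ); the function name sysrq_state is made up for illustration:

```shell
#!/bin/bash
# Report whether the magic SysRq key is enabled on this machine.
# /proc/sys/kernel/sysrq holds a bitmask: 0 = disabled, 1 = all functions,
# other values enable a subset (see the kernel's sysrq documentation).
sysrq_state() {
    local f=/proc/sys/kernel/sysrq
    if [ -r "$f" ]; then
        echo "sysrq=$(cat "$f")"
    else
        echo "sysrq=unavailable"
    fi
}

sysrq_state
# To enable all SysRq functions until the next reboot (as root):
#   echo 1 > /proc/sys/kernel/sysrq
```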


Another piece of evidence(?) that this is not a HW failure:
  * never happens with Debian's 2.6.26-17lenny1 all_generic_ide=1 gcc4.1.3
  * easy to trigger with 2.6.30.4 gcc4.3.2

...I know the kernel version, kernel parameters and gcc all differ, but a HW
error would have occurred anyway.


Kernel config, dmesg and lspci attached.

-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
