Tejun Heo wrote:
> Robert Hancock wrote:
>
>> Tejun Heo wrote:
>>
>>> Stefan wrote:
>>>
>>>> Hi folks,
>>>>
>>>> yesterday I upgraded kernel 2.6.19 to 2.6.20 (gentoo kernel). Now my
>>>> box locks up about 10 min after boot.
>>>> After that I tested with a vanilla 2.6.21.1 it shows the same behavior.
>>>> I'm attaching a kern log file from the 2.6.20. The 2.6.21.1 locked up so
>>>> hard, that there was no trace left in the log file.
>>>> If I switch back to 2.6.19 everything is fine again.
>>>> If necessary I will hook up a laptop to this box, so I can capture
>>>> messages via netconsole.
>>>>
>>> Yes please.
>>>

Okay, I had time to set this up. I'm attaching the log messages I got
via netconsole.

>>>> This machine is running an AMD X2 64, NFORCE4 (ASUS A8N-E)
>>>>
>>>> I haven't been following the development of libata for a while, but from
>>>> the 2.6.20 changelog it looks like there have been some major changes.
>>>>
>>>> Just let me know what kind of information you need in order to narrow it
>>>> down.
>>>>
>>> Does giving 'sata_nv.adma=0' kernel parameter make any difference?
>>>

I tested for about 20h with adma disabled, and the crash did not occur.
If I remove sata_nv.adma=0 from the boot options again, it doesn't take
long until my machine locks up.

[Attached dmesg output with 2.6.21.1 kernel + crash info I got via
netconsole]

I hope this is useful to you guys.

Cheers

Stefan

>> If adma=0 works, then that means it's either an ADMA related problem or
>> that SAMSUNG HD401LJ drive has some problems with NCQ (since ADMA off
>> means NCQ off as well). I would say the latter is more likely.
>>
>
> I don't have first hand experience with the particular model but I'll be
> surprised if they screwed their firmware up with new generation of
> harddisks. Firmware on the previous generation drives was pretty good
> and they don't get worse usually.
>
>
>> We should really have some kind of "noncq" kernel parameter we can use
>> to help debugging these problems. Though, later kernels are supposed to
>> switch it off automatically after too many errors..
>>
>
> Till now there hasn't been any case where a broken NCQ prevented a
> machine from booting but, yeah, having such thing would be nice for
> debugging.
Attachment:
bootlog.txt.gz
Description: Unix tar archive
Attachment:
crashtrace.txt.gz
Description: Unix tar archive