>>>>> "Theodore" == Theodore Tso <tytso@xxxxxxx> writes:

Oops.  It looks like 2.6.29.3 is actually quite solid.  My fault, I
must have gotten confused.  I know that 2.6.30-rc* was unstable on
there and locked up easily.

Theodore> On Tue, May 19, 2009 at 02:27:14PM -0400, John Stoffel wrote:
>> I wonder if this is the reason my main file server has been locking up
>> solid under 2.6.29 or newer kernels lately, but 2.6.28 is rock solid.
>> Since it's my main file server at home, and with my home dir NFS
>> mounted from it onto another system, it's been hard to catch.  I spent
>> some time fiddling around getting netconsole setup, but then I ran out
>> of time.

Theodore> Unless you have your partition mounted with the "sync" mount
Theodore> option (which has negative performance implications; it
Theodore> makes sense for a mail queue directory, but not necessarily
Theodore> for a general purpose file server) or you have a directory
Theodore> chattr'ed with the sync flag, probably not...

Theodore> If you want to try it, though, the patch is available here:

Theodore> http://bugzilla.kernel.org/attachment.cgi?id=21436

Ok, then it's probably not something I need to test, since I'm only
mounting stuff noatime.

>> If someone could send me the patch, I'll apply it and see how well
>> 2.6.29.[34] works, and whether or not 2.6.30-rcN works as well.
>> Reproducing the problem was pretty easy for me.

Theodore> Anything on the console?  Any oops messages, or soft lockup warnings?

Nothing.  I've not had the time lately to reboot the system to try
2.6.29 or newer with all the lockup debugging stuff yet.  Maybe
tonight I'll get a chance.

Theodore> What filesystem(s) are you using?

ext3 for everything, except one staging area running ext4 which is
only used for bacula to stage data before writing to tape.  It's solid
under 2.6.29.3 (dammit, I must have mis-remembered) and it's been up
now for six days running backups and serving NFS files.
Here's my filesystems:

    > mount
    /dev/sda2 on / type ext3 (rw,errors=remount-ro)
    tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
    proc on /proc type proc (rw,noexec,nosuid,nodev)
    sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
    procbususb on /proc/bus/usb type usbfs (rw)
    /udev on /dev type tmpfs (rw,mode=0755)
    tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
    devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
    fusectl on /sys/fs/fuse/connections type fusectl (rw)
    /dev/sda5 on /var type ext3 (rw,noatime)
    /dev/sda1 on /boot type ext3 (rw,noatime)
    /dev/sda6 on /usr type ext3 (rw,noatime)
    /dev/dm-1 on /home type ext3 (rw,noatime)
    /dev/dm-2 on /local type ext3 (rw,noatime)
    overflow on /tmp type tmpfs (rw,size=1048576,mode=1777,size=50%)
    rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
    nfsd on /proc/fs/nfsd type nfsd (rw)
    binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
    /dev/mapper/onetwenty-staging on /staging type ext4 (rw,noatime)

When the system locks up, there's nothing in the logs, nothing on the
screen, even when I leave it turned to VT1 (Ctl-Alt-F1) and then wait
for a lockup, the screen is completely blank.

I'll see about finding some more time to beat on this and get better
results back to people.

John
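
For anyone who wants to check quickly whether the "sync" mount option applies to their own system, here is a minimal Python sketch that scans `mount` output in the format shown in this message and reports any mount point whose option list contains `sync`. The names `parse_mount_line`, `sync_mounts` and `SAMPLE` are made up for illustration. It assumes mount points contain no spaces, and it does not cover the per-directory sync flag set with chattr, which would need a separate check (for instance with lsattr).

```python
import re

# Classic `mount` output: "<device> on <mountpoint> type <fstype> (<options>)"
# Assumption: neither the device nor the mount point contains whitespace.
MOUNT_LINE = re.compile(
    r"^(?P<dev>\S+) on (?P<mnt>\S+) type (?P<fstype>\S+) \((?P<opts>[^)]*)\)$"
)


def parse_mount_line(line):
    """Parse one line of `mount` output; return None if it does not match."""
    m = MOUNT_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "dev": m.group("dev"),
        "mnt": m.group("mnt"),
        "fstype": m.group("fstype"),
        "opts": m.group("opts").split(","),
    }


def sync_mounts(lines):
    """Return the mount points whose options include the exact word "sync"."""
    hits = []
    for line in lines:
        entry = parse_mount_line(line)
        if entry is not None and "sync" in entry["opts"]:
            hits.append(entry["mnt"])
    return hits


# A few lines copied from the listing in this message.
SAMPLE = [
    "/dev/sda2 on / type ext3 (rw,errors=remount-ro)",
    "/dev/dm-1 on /home type ext3 (rw,noatime)",
    "/dev/mapper/onetwenty-staging on /staging type ext4 (rw,noatime)",
]

if __name__ == "__main__":
    print(sync_mounts(SAMPLE))
```

On a live system the same function could be fed the lines of `subprocess.run(["mount"], capture_output=True, text=True).stdout.splitlines()`. None of the filesystems listed above carries the `sync` option, which matches the conclusion that the patch is probably not relevant here.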