On Sat, Jan 10, 2009 at 01:21:06PM -0500, Mike Snitzer wrote:
> > In practice i rarely see bugfixes that were debugged via kdump. Normal
> > oops based fixes outnumber kdump based fixes by a ratio of 1:100 or worse
> > - and kdump is readily available these days - just nobody configures it.
>
> So you're telling me RedHat doesn't rely on kdump at enterprise
> customer installations? I find that hard to believe. Few enterprise
> customers allow defects to be debugged on-site, sometimes collecting a
> crash dump is all you can hope for to make progress. I have to
> believe you know this fairly well; if not with direct experience then
> through your co-workers? Or am I living in Ingo's version of Linux
> hell where kdump is actually useful?

In my experience, there are very few combinations of kernel version
and hardware on which kdump works. I've talked to the people who have
to make kdump work, and every 12-18 months, when a new set of
enterprise kernels comes out, they have to go and fix kdump so it
works again for the set of hardware that they care about, and for the
kernel version involved.

Part of the problem is one which has infected nearly every single RAS
technology out there, from kdump to Systemtap, which is that the
people who architect and fund these RAS technologies delude themselves
into thinking that they only have to worry about making it work for
enterprise kernels and enterprise users, and to hell with everyone
else --- specifically, kernel developers, who don't matter since they
aren't enterprise users. Heck, until July of last year, Systemtap
wouldn't even ***compile*** out of the box on a non-enterprise
distribution like Ubuntu or Debian. And I have yet to make kdump work
on a Thinkpad, although I've tried.

Since pretty much no one uses these RAS technologies except enterprise
users, and no one bothers to make them easy for kernel developers to
use, kernel developers have developed alternate mechanisms for
debugging the Linux kernel --- and they don't involve using Systemtap
or kdump, because in practice, those tools don't work for them at all,
or it's too hard to make them work.

And this becomes a vicious cycle; since no one bothers to spend time
making RAS technologies work for everyday use by kernel developers,
bitrot inevitably sets in, and so the RAS developers get no help from
other kernel developers, who are busy fixing their own problems via
different means; and so the RAS developers hunker down, and spend even
more time fixing the bitrot and complaining that no one helps them or
takes them seriously, and the problem gets worse and worse and worse
--- until now there are people who are busily developing alternatives
to Systemtap, just because too many RAS architects and developers had
their priorities wrong, and forgot to focus on everyday kernel
developers instead of just enterprise users.

It's very sad, and it means a lot of investment gets wasted, and work
is getting duplicated as a result. Oh, well.

						- Ted