A few weeks ago we had a 4-way amd64 web server running RHEL 4 that
crashed sporadically -- nothing left in the syslog. up2date didn't find
a new kernel, so I just downloaded and installed the latest kernel from
kernel.org and the system has been stable ever since. I'm not sure if I
could have gone to RH for support because Cornell has a site license,
and even if I had a direct line to RH management, it would take me more
time to explain the problem than it would take to try a mainstream kernel.
Overall, I'm quite happy with the four-digit revision mainstream
Linux kernels. We had a crash on our main machine that left a stack
trace; I did some research on the web, found that this had been fixed in
2.6.11.something, upgraded the kernel, case closed.
People are willing to pay $$ to get an "enterprise" product which is
reliable and supported, but this is another case where the generic
product turns out to be more reliable than the branded product, and
looking at what's happening with Fedora, I've got a lot of concern that
RH's pursuit of innovation will always lead to a kernel long on gee-whiz
features and short on reliability. Crashes mean I get calls from the
NOC at 4am, and god forbid that my toddler hears the phone ring or me
walking down the stairs, because I'll need to entertain him while
dealing with the crash and for the rest of the morning. Then a week
later I go to netcraft and they say my uptime is seven days and I feel
like a jerk because the whole world knows about my problems.
I think there are two reasons for the RHEL 4 instability: (i) the
quarterly release cycle means that I have to wait for bug fixes -- and
if you're running a non-x86 architecture, it seems like 2.6 is shaking
out bugs at a high rate, and (ii) RH is aggressively pushing new features.
I really don't know what's in RHEL 4 (it would take me more time to
look at the patches than it would to revert to mainstream) but the
activation of 4KSTACKS in Fedora is one of those changes that reduces
reliability.
I've been looking, and I've never found out what benefit
4KSTACKS has for end users. The kernel team is sensible, so I'm sure
that there are some real benefits, but looking at the problem reports
and at the attitudes of some people on this list, I start to wonder if
it's just a vindictive attempt to put an end to ndiswrapper. (I'd
really love to see an explanation of the benefits of 4KSTACKS.)
The real trouble is that 4KSTACKS problems aren't in kernel modules
per se, but really are in the combination of modules that are running.
Yeah, maybe they can get reiserfs running under 4KSTACKS, but what if
you're running an NFSv4 server with all the whizzy options turned on,
and IPv6 with tunneling and it's a reiserfs filesystem and you're using
LVM and RAID and a particularly funky SCSI driver, what then?
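To make the combination argument concrete, here is a toy sketch in Python. Every number in it is invented for illustration and is not a measurement of any real kernel, and it assumes, as a simplification that may not hold in practice, that all the layers end up nested on a single call chain. The point is only that each layer can fit comfortably in 4 KB on its own while the sum does not.

```python
# Hypothetical per-layer stack usage in bytes. These figures are made up
# for illustration; they are NOT measurements of any real kernel or module.
LAYER_STACK_BYTES = {
    "nfsd": 900,
    "ipv6_tunnel": 700,
    "reiserfs": 1100,
    "lvm": 500,
    "raid": 600,
    "scsi_driver": 800,
}


def stack_depth(call_chain, usage=LAYER_STACK_BYTES):
    """Total stack consumed if every layer in call_chain is nested at once."""
    return sum(usage[layer] for layer in call_chain)


def fits(call_chain, budget_bytes, usage=LAYER_STACK_BYTES):
    """True if the nested call chain stays within the stack budget."""
    return stack_depth(call_chain, usage) <= budget_bytes


# Simplifying assumption: all six layers are on the stack at the same time.
full_chain = ["nfsd", "ipv6_tunnel", "reiserfs", "lvm", "raid", "scsi_driver"]

if __name__ == "__main__":
    for layer in full_chain:
        print(layer, "alone fits in 4096:", fits([layer], 4096))
    print("full chain depth:", stack_depth(full_chain))
    print("full chain fits in 4096:", fits(full_chain, 4096))
    print("full chain fits in 8192:", fits(full_chain, 8192))
```

Testing each module in isolation would pass every check here; only the full combination exceeds the smaller budget, which is why this kind of failure tends to show up on particular setups rather than in any one module's test runs.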
By adopting 4KSTACKS early, Fedora has helped shake out problems
with 4KSTACKS, but when 4KSTACKS becomes the main option in the
mainstream kernel, we'll see people dealing with weird problems that
happen sporadically on certain setups for years to come. We seem to
have one of the worst workloads in the world, and the last thing I need
is more crashes.
--
fedora-devel-list mailing list
fedora-devel-list@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/fedora-devel-list