On Tue, 2007-01-30 at 20:44 -0800, David Fetter wrote:
> On Tue, Jan 30, 2007 at 04:43:14PM -0800, Richard Troy wrote:
> > On Tue, 30 Jan 2007, Mark Walker wrote:
> > >
> > > I don't know. My customers expect 24/7 reliability. They expect
> > > to be able to access their info anywhere in the world over a
> > > variety of different devices. I can remember times when people
> > > would just go home because computer networks were down. I haven't
> > > seen that happen in a long time.
> >
> > ...Back in 1986, Cheryl Healy and I took on running Polaroid's
> > corporate systems "24 X 7" - and we worked hard to make it "24 X 7 X
> > 365.24". Shortly thereafter - while still working with Cheryl,
> > Angel Vila, Chris Boerner and I took on running Bellcore's 800
> > telephone network full time also - our success was measured by how
> > few minutes/seconds there was any lost business at all on an annual
> > basis. (Bellcore was previously known as AT&T Bell Laboratories.) If
> > you made an 800 number based call from '86 to '89, the systems I
> > managed for Bellcore helped place that call. ... I could go on. I've
> > worked in the "always up" community a long time now and have worked
> > with/for more corporations in this capacity than nearly anyone you
> > might find - mostly very large, well known companies.
> >
> > My observation is that we have a real shortage of quality operating
> > systems today, and what few exist/remain don't enjoy much market
> > share because they're not based on Unix, so they're largely missing
> > out on the Open Source activity. What may be worse, young people who
> > don't know any better are sometimes told/taught not to bother with
> > anything over five years old as it's antiquated so they don't ever
> > find out that things could be better - and once were. (Example,
> > anyone who thinks "man pages" are great has obviously got a very
> > limited experience from which to base their opinion!) ...
> > As a practical matter today we mostly have a choice of Windows or
> > some flavor of unix, neither of which is great. That would be very
> > different in my opinion if only Unix didn't have this asinine view
> > that the choice between a memory management strategy that kills
> > random processes and turning that off and accepting that your system
> > hangs is a reasonable choice and that spending a measly % of
> > performance in overhead to eliminate the problem is out of the
> > question. Asinine, I tell you.
>
> The OOM killer in Linux is, indeed, asinine. You can shut it off,
> though, and systems administrators worth their salt know this and do
> it as a matter of routine. If you have some strategy that doesn't
> involve those hangs as a consequence, I'm sure you can get an audience
> from the Linux kernel people and/or the FreeBSD ones.
>

I know this is off-topic for this list, but is there a place I can get
some details about the Linux OOM killer, and the conditions that cause
this OS hang when you turn off the OOM killer?

I'd like to really know what's happening, and also know more about the
OS hanging condition that you're talking about. I'd also like to know
how safe the "safe" settings really are (vm.overcommit_memory=2 and
vm.oom-kill=0?).

Right now I'm using FreeBSD (in large part due to the Linux OOM
killer), but I have a different set of problems on FreeBSD.

Regards,
	Jeff Davis