Re: understanding postgres issues/bottlenecks

On Thu, 15 Jan 2009, Jean-David Beyer wrote:

M. Edward (Ed) Borasky wrote:
| Luke Lonergan wrote:
|> Not to mention the #1 cause of server faults in my experience: OS
|> kernel bug causes a crash.  Battery backup doesn't help you much there.
|>
|
| Well now ... that very much depends on where you *got* the server OS and
| how you administer it. If you're talking a correctly-maintained Windows
| 2003 Server installation, or a correctly-maintained Red Hat Enterprise
| Linux installation, or any other "branded" OS from Novell, Sun, HP, etc.,
| I'm guessing such crashes are much rarer than what you've experienced.
|
| And you're probably in pretty good shape with Debian stable and the RHEL
| respins like CentOS. I can't comment on Ubuntu server or any of the BSD
| family -- I've never worked with them. But you should be able to keep a
| "branded" server up for months, with the exception of applying security
| patches that require a reboot. And *those* can be *planned* outages!
|
| Where you *will* have some major OS risk is with testing-level software
| or "bleeding edge" Linux distros like Fedora. Quite frankly, I don't know
| why people run Fedora servers -- if it's Red Hat compatibility you want,
| there's CentOS.
|
Linux kernels seem to be pretty good these days. I ran Red Hat Linux 7.3
24/7 for over 6 months, and it was discontinued years ago. I recognize that
this is by no means a record. It did not crash after 6 months, but I
upgraded that box to CentOS 4 and it has been running that a long time. That
box has minor hardware problems that do not happen often enough to find the
real cause. But it stays up months at a time. All that box does is run BOINC
and a printer server (CUPS).

This machine does not crash, but it gets rebooted whenever a new kernel
comes out, and has been up almost a month. It runs RHEL5.

I would think Fedora's kernel would probably be OK, but the other bleeding
edge stuff I would not risk a serious server on.

I have been running kernel.org kernels in production for about 12 years now (on what has now grown to a couple hundred servers), and I routinely run from upgrade to upgrade with no crashes (I tend to upgrade every year or so).

that being said, things happen. I have a set of firewalls running the Checkpoint Secure Platform Linux distribution that locked up solidly a couple of weeks after I put them in place (the iptables firewalls that they replaced had been humming along just fine under much heavier loads for months).

the more mainstream your hardware is, the safer you are (unfortunately very few RAID cards are mainstream), but I've also found that compiling a minimal kernel that only supports the stuff I need contributes to reliability.

but even with my experience, I would never architect anything with the expectation that system crashes don't happen. I actually see more crashes due to overheating (fans fail, AC units fail, etc.) than I do from kernel crashes.

not everything needs reliability. I am getting ready to build a pair of postgres servers that will have all safety disabled. I will get the redundancy I need by replicating between the pair, and if they both go down (datacenter outage) it is perfectly acceptable to lose the entire contents of the system and reinitialize from scratch (in fact, every boot of the system will do this).

but you need to think carefully about what you are doing when you disable the protection.
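
For illustration only, here is a rough sketch of what "all safety disabled" might look like in postgresql.conf terms. The message does not say which settings are meant, so the choice of fsync, synchronous_commit and full_page_writes is an assumption; those are real PostgreSQL parameters, but the helper name render_conf and the UNSAFE_SETTINGS table are hypothetical and only serve to print the fragment.

```python
# Assumption: "all safety disabled" is read here as turning off the three
# durability-related settings below. The original message does not list them.
UNSAFE_SETTINGS = {
    "fsync": "off",               # do not force WAL/data writes to disk
    "synchronous_commit": "off",  # commits return before WAL is flushed
    "full_page_writes": "off",    # no torn-page protection after a crash
}


def render_conf(settings):
    """Return postgresql.conf text, one 'name = value' line per setting."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items()) + "\n"


if __name__ == "__main__":
    # Print the fragment so it can be reviewed before being put in a config file.
    print(render_conf(UNSAFE_SETTINGS), end="")
```

With settings like these, a crash or power loss can leave the data directory unrecoverable, which is only tolerable in a setup like the one described above, where the contents are rebuilt from scratch anyway.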

David Lang

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
