
Re: young guy wanting (Postgres DBA) ammo

On Fri, 2 Nov 2007, Kevin Hunter wrote:

> I don't have "ammo" to defend (or agree?) with my friend when he says that "Postgres requires a DBA and MySQL doesn't so that's why they choose the latter."

A statement like this suggests a fundamental misunderstanding of what a DBA does, and unfortunately for you that means you're stuck with educating them as to why they don't even understand the concept--which is particularly tough when you're not a DBA yourself.

The job of a DBA is to make sure the data you're storing in the database is safe and that the system as a whole performs fast enough to keep up with demand. If your data is so trivial that it doesn't really matter whether the data stays intact or gets corrupted, and there are no performance requirements to meet, then you don't need someone operating as a DBA; in every other case, you do.

It's simple to set up MySQL with the default configuration running such trivial workloads, giving the impression you've built a system that works fine. There are a number of ways this default setup can end up with corrupted data one day. As mentioned in the paper you've already read, it's possible to set up recent MySQL versions to run in the new strict modes with the right type of engine such that it has reasonable standards for data integrity. Actually doing that work _correctly_ will require a DBA, but since it's possible not to do it at all and have things appear to work, many people walk away thinking they didn't need someone acting in that role at all.
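To make "appears to work" concrete, here is a minimal sketch of the general idea. It uses SQLite through Python's standard library purely as a stand-in, so it does not reproduce the behavior of MySQL or PostgreSQL, and the table names and sample rows are hypothetical. It only shows how a schema with no constraints quietly accepts bad data that a stricter schema rejects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Permissive table (hypothetical): nothing declared beyond names and types.
conn.execute(
    "CREATE TABLE orders_loose (id INTEGER PRIMARY KEY, quantity INTEGER)"
)

# Strict table (hypothetical): the schema refuses missing or nonsensical values.
conn.execute(
    "CREATE TABLE orders_strict ("
    "id INTEGER PRIMARY KEY, "
    "quantity INTEGER NOT NULL "
    "CHECK (typeof(quantity) = 'integer' AND quantity > 0))"
)

# Three rows no application should store: a missing value, a negative
# quantity, and text where a number belongs.
bad_rows = [(None,), (-5,), ("lots",)]


def count_accepted(table):
    """Return how many of the bad rows the given table accepts."""
    accepted = 0
    for row in bad_rows:
        try:
            conn.execute(f"INSERT INTO {table} (quantity) VALUES (?)", row)
            accepted += 1
        except sqlite3.IntegrityError:
            # The constraint did its job and the row was rejected.
            pass
    return accepted


loose_accepted = count_accepted("orders_loose")
strict_accepted = count_accepted("orders_strict")
print(f"loose table accepted {loose_accepted} bad rows")
print(f"strict table accepted {strict_accepted} bad rows")
```

The loose table never raises an error, which is exactly why such a setup can look healthy until someone inspects the data.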

PostgreSQL defaults to high standards for data integrity and as a result you can't avoid being exposed to some amount of fighting with the inevitable ramifications of that. An example already thrown out here is that you must do some amount of initially frustrating configuration in order to even get users to log in the way people expect. Another one on the performance side is that you'll be forced to understand the trade-offs in how vacuuming works in PostgreSQL in order to keep your system running acceptably. It's not possible to run a secure database on a larger scale without going through these sorts of exercises. But if you don't care about security and never reach a large scale, you could get the impression that this work was a waste of time, and that the database that forced you to go through it was unreasonably difficult to set up without a DBA.

To step back for a second, the software industry as a whole is going through this phase right now where programmers are more empowered than ever to run complicated database-driven designs without actually having to be DBAs. It used to be that you "needed a DBA" for every job like this because they were the only people who knew how to set up the database tables at all, and once they were involved they also (if they were any good) did higher-level design planning, with scalability in mind, and worried about data integrity issues.

Software frameworks like Ruby on Rails and Hibernate have made it simple for programmers to churn out code that operates on databases without having the slightest idea what is going on under the hood. From a programmer's perspective, the "better" database is the one that requires the least work to get running. This leads to projects where a system that worked fine "in development" crashes and burns once it reaches a non-trivial workload, because if you don't design databases with an eye towards scalability and integrity you don't magically get either.

The sad part is that it's nearly impossible to educate people going through this process about what they're doing wrong. Human nature is such that until you've had a day where sloppy setup caused you to lose a gigantic amount of data, spending some time with that sick feeling in your stomach that everyone who has been through this knows, it's hard to ever reach the level of paranoia necessary to be a successful DBA. Until you've fought to speed up a database application where data normalization is the only way to solve the fundamental problem causing the slowdown, it's impossible to truly appreciate why you should consider design tradeoffs in that area from day one. Can you build a database without someone who has been through these experiences? Sure. That doesn't mean it's a good idea.

--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD

