Search Postgresql Archives

Re: PGSQL or other DB?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Jan 30, 2009, at 11:37 AM, durumdara wrote:

The main viewpoints:
- quick (re)connect - because mod_python basically not store the database connections persistently

mod_python is not a database adaptor. Put another way, mod_python does not make database connections. If you're using Python then to connect to Postgres you'll most likely be using psycopg2 to connect to and query your database. I'm not sure what your previous database experience is but it sounds like you may be thinking in terms of PHP's persistent connections and on that I'll just say that PHP's persistent connections are not true persistent connections from your top-level application's perspective. If you want connection pooling then there are options available, pgpool-II and pgbouncer are probably the two most commonly used ones today.

 - fast queries

Stop and think about that one for a moment. Do you really expect to go to a specific database's mailing list(s) with that requirement expecting anything other than "We got you covered!"

 - easy IDE to use (like pgadmin)

I think you just answered your own question.

- the db server can handle many databases (webdb_nnn where nnn is an integer)

I know people hate hearing this answer but that's the wrong question. What matters isn't how many databases can be run (and, if you ever run into a database server in this day and age where it does, run) but rather how many concurrent connections you expect as well as both the total data set size and how much of that data set will be constantly worked with, i.e. needs to be in memory. Let's say you have 10 5G databases with each averaging 10% of their total size in use (in memory, hopefully) at any given moment. That's entirely different from 10 10G databases with each averaging 10% in memory and both are different from running 200 databases each with 5% in use at any given moment. All of that being said, a properly configured Postgres provisioned for the workload installation built with a little knowledge will scale with the best of 'em.

- I can add/modify a table, or a field to a table without "full lock" on the table (like DBISAM restructure). Like in FireBird, where the "add field" change only the table description. I don't know that PG supports this way of the DB modifying.

Nope. PostgreSQL is an all or nothing transactional database. I'd never heard of DBISAM before you mentioned it and have never used Firebird. After doing a little reading it turns out that if you plan to use transactions at all (which is very likely given even just the little you've described about the applications you're building) then you should realize that altering tables is not compatible with transactions and doing so will automatically commit any open transactions on the table. Looking into Firebird I couldn't find how it handles (or doesn't) that at all I but I did see that it will happily let you add a new not null column with no default to a table by writing nulls for the new attribute for any existing columns. That already makes me queasy.

 - Quick and easy backup/restore system

Well, again, that depends. I believe pgAdmin has some nice menu options to handle simple cases (others can either add to or correct me there) but as to the core Postgres distribution there's still a bit of ambiguity arising from cases of "there's more than one way to do it" even for basic tasks as well as some inconsistent UI aspects to the existing tools (see the pg_dump/pg_restore/psql comments in the recent Pet Peeves thread).

Another important thing that I don't understand (what as I saw) that the PostGreSQL is store the databases in one, unseparatable file set, in a directory named data. In another databases, like DBISAM, FireBird, MySQL, the databases are separated to another directories/files.

This "one datadir" is seems to be not too good for us. We used DBISAM in our clients, and many times when we got some filesystem error, we can simply recover the tables - from the files.

Way off. While I didn't look into the storage layouts for DBISAM or Firebird I can tell you that in any MySQL server instance all of the tables will be under (to some depth) the same top level data directory directory with each database in each own directory under that and table files for each database under those. I think you may be able to use symlinks MyISAM table files but InnoDB tables will croak if you try to do that. What's more, by default InnoDB does *not* use separate files per table and instead stores all InnoDB tables in one file with it's own internal file system. Admittedly, I think most people turn on the innodb_file_per_table option such that each table will get it's own data and index files but even then all of a given table's data indexes will reside in the same file. Each of those InnoDB table data+index files is referred to as an InnoDB tablespace (but that's really different from what most people expect a tablespace to be, see next paragraph). One consequence of that is that adding an index to a table requires that the entire table and all of it's indexes be rewritten.

PostgreSQL also follows the 1 directory per-database setup. Each database directory is located directly underneath the base/ direcotry in the main Postgres One big difference is that table data and index data are in separate files and multiple indexes for a given table are all in separate files. This can be viewed as the low-level embodiment of the fact that in PostgreSQL indexes are not part of the tables they are for but rather separate, but tightly coupled, database objects. With this scheme adding a new index, which doesn't even require the table to be locked any more for new indexes, only needs to write the new index file, not rewrite the table data and previously existing indexes as well. What's more, Postgres allows you to create real tablespaces so that you can place individual persistent database objects (databases, tables, indexes, and some constraints) on separate storage.


When we want to backup or restore one database, we can do it in two way: a.) archive all files b.) make sql dump from database.

Both options are available with Postgres although, as has been noted by others, option 'a' requires other tools in order to back up a consistent data set.

If a file get corrupted in a database, then we can restore the datas from files, and this filesystem error causes problems only for this database, not for all.

I very fear from to keep all databases in one place, because if they are corrupted, possible more of them injured (if they are not separated). I cannot make filesystem based (hard) copy from one db (only SQL dump enabled).

Again, I'm not sure what you mean. There are different kinds of corruption that can be found ion files. Database files could be considered corrupt if the database screws something up and writes bad data to a file. That would be localized to the files it wrote to in the bad operation (and is extremely rare with Postgres) whereas what I think of as filesystem corruption often doesn't see file boundaries.

Ok, I saw that pgsql supports "tablespaces", but as I saw, this function can hold only table datas in the another directory, and not the full database can separated with them.

No, you can put entire databases in/on different tablespaces.

Because I don't used one PGSQL with many databases (up to 200), I don't know, what happening, and which cases possible. But I think you have many experience with it. Please share it with me!

Please help me, because we need to determine which DB to use.

I started the usage of the PG in prev. month, and I liked it except the way of the data storage (one data dir).

I tried the MySQL before I tried PG. InnoDB is seems to be "forcing transaction system on MyISAM". And boolean data type is missing (solved with enum?). I don't like it all, but it is seems to be fast with little tables, and it is separate the database files to another directories which thing I like. Possible it have many limitations what I don't saw in first time.

Well, if you don't want transaction then stay away from Postgres and realize that Postgres actually gives you more built-in support for placing your data at different locations than MySQL.

Please help me, which DB is good for us, and how to configure, and use PGSQL with these database-set which we need to use.

Just realize that when choosing a complex technology solution like a database there really is no "yes" or "no" answer or "here's the right way to do it" and "here's the wrong way". You need to define your requirements and part of that is asking questions. For example, 'how to configure..' -- if you seriously ask a question like that here you're going to get more questions (what kind of IO subsystem? how many concurrent connections do you need to support? how much memory do you have? how large is our data set? etc...) than you will answers.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux