Re: large dataset with write vs read clients

On 10/10/2010 5:35 AM, Mladen Gogala wrote:
> I have a logical problem with asynchronous commit. The "commit" command
> should instruct the database to make the outcome of the transaction
> permanent. The application should wait to see whether the commit was
> successful or not. Asynchronous behavior in the commit statement breaks
> the ACID rules and should not be used in a RDBMS system. If you don't
> need ACID, you may not need RDBMS at all. You may try with MongoDB.
> MongoDB is web scale: http://www.youtube.com/watch?v=b2F-DItXtZs

That argument makes little sense to me.

Because you can afford a clearly defined and bounded loosening of the durability guarantee provided by the database - you know and accept that you might lose the last x seconds of work if your OS crashes or your UPS fails - it somehow follows that you don't really need durability guarantees at all? Let alone all that atomic commit silliness, transaction isolation, or the guarantee of a consistent on-disk state?
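For what it's worth, PostgreSQL lets you make exactly that bounded trade-off per session or even per transaction via synchronous_commit. A minimal sketch using psycopg2 - the DSN and the samples table are made up for illustration:

    import psycopg2

    # Hypothetical DSN and table; adjust for the real schema.
    conn = psycopg2.connect("dbname=mydb")
    cur = conn.cursor()

    # Loosen durability for this session only: an OS crash can lose
    # the last moments of committed work (up to ~3 * wal_writer_delay),
    # but atomicity, isolation and on-disk consistency are untouched.
    cur.execute("SET synchronous_commit TO off")

    cur.execute("INSERT INTO samples (val) VALUES (%s)", (42,))
    conn.commit()  # returns without waiting for the WAL flush

    # A transaction that must survive a crash can opt back in:
    cur.execute("SET LOCAL synchronous_commit TO on")
    cur.execute("INSERT INTO samples (val) VALUES (%s)", (43,))
    conn.commit()

The point being: you lose a bounded window of recent commits on a crash, nothing more. That's a very different proposition from giving up ACID wholesale.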

Some of the other flavours of non-SQL database, both those that've been around forever (PICK/UniVerse/etc, Berkeley DB, Caché, etc) and those that're new and fashionable (Cassandra, CouchDB, etc), provide some ACID properties anyway. If you don't need or want an SQL interface to your database, you don't have to throw out all the other database-y goodness along with it - unless you've been drinking too much of the NoSQL kool-aid.

There *are* situations in which it's necessary to switch to distributed, eventually-consistent databases with non-traditional approaches to data management. It's awfully nice not to have to, though; going that route can force you to reinvent a lot of wheels when it comes to querying, analysing and reporting on your data.

FWIW, a common approach in this sort of situation has historically been - accepting that RDBMSs aren't great at continuous fast loading of individual records - to log the records in batches to a flat file, Berkeley DB, etc as a staging point. You periodically rotate that file out and bulk-load its contents into the RDBMS for analysis and reporting. This doesn't have to be every hour - every minute is usually pretty reasonable, and still gives your database a much easier time without forcing you to modify your app to batch inserts into transactions or anything like that.
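A minimal sketch of that staging pattern, again assuming psycopg2 and the hypothetical samples(ts, val) table; the file path and rotation interval are placeholders:

    import os
    import psycopg2

    STAGING = "/var/spool/myapp/staging.csv"   # hypothetical path

    def log_record(ts, val):
        # Fast path: append to a flat file, no database round-trip.
        with open(STAGING, "a") as f:
            f.write("%s\t%s\n" % (ts, val))

    def rotate_and_load(conn):
        # Run every minute or so from a timer or cron job.
        if not os.path.exists(STAGING):
            return
        batch = STAGING + ".loading"
        os.rename(STAGING, batch)  # atomic on POSIX; new writes start a fresh file
        with open(batch) as f, conn.cursor() as cur:
            # One COPY is far cheaper than thousands of single-row INSERTs.
            cur.copy_from(f, "samples", sep="\t", columns=("ts", "val"))
        conn.commit()
        os.remove(batch)

The rename gives you a clean cut-over point: new records go to a fresh staging file while the old one is bulk-loaded, so the application never blocks on the database.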

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/


