Re: PostgreSQL as a local in-memory cache

Dave Crooke <dcrooke@xxxxxxxxx> · Wed, 30 Jun 2010 11:42:50 -0500

I haven't jumped in yet on this thread, but here goes ....

If raw query performance is what you're really after, then any database designed with reliability and ACID consistency in mind will inherently carry some features that work against you.

Some other ideas to consider, depending on your query mix:

1. MySQL with the MyISAM storage engine (non-ACID)

2. Put an in-application generic query cache in front of the DB, that runs in the app address space, e.g. Cache' if using Java

3. Using a DB is a good way to get generic querying capability, but if the "where" clauses in your queries cover only a small set of metadata, and SQL syntax is not a big requirement, consider non-RDBMS alternatives. For example, use XPath over a W3C DOM object tree to get primary keys into in-memory hash tables (possibly distributed with something like memcached)
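To make idea 3 a bit more concrete, here is a rough Java sketch of what I mean, using only the JDK's built-in DOM and XPath classes. The XML layout, the attribute names (id, region, status), the class name and the payload strings are all made up for illustration; the point is only that the "where" clause becomes an XPath expression over a small metadata tree, and the result is a list of keys into an ordinary in-process hash table. I've used ConcurrentHashMap since concurrency matters here, but that is my assumption, not a requirement of the approach.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathIndexSketch {
    // Hypothetical metadata: one <item> per record, holding only the
    // fields that queries filter on. The full records live in the map.
    static final String METADATA =
        "<items>"
      + "<item id='1' region='us' status='active'/>"
      + "<item id='2' region='eu' status='active'/>"
      + "<item id='3' region='us' status='retired'/>"
      + "</items>";

    // Parse the metadata into a W3C DOM tree (done once, at load time).
    static Document parseIndex(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse metadata", e);
        }
    }

    // Evaluate an XPath expression that selects id attributes and
    // return their values as primary keys.
    static List<String> findKeys(Document index, String xpathExpr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                xpathExpr, index, XPathConstants.NODESET);
            List<String> keys = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                keys.add(hits.item(i).getNodeValue());
            }
            return keys;
        } catch (Exception e) {
            throw new IllegalStateException("bad XPath: " + xpathExpr, e);
        }
    }

    public static void main(String[] args) {
        Document index = parseIndex(METADATA);

        // The in-memory hash table, keyed by primary key.
        Map<String, String> rows = new ConcurrentHashMap<>();
        rows.put("1", "payload for item 1");
        rows.put("2", "payload for item 2");
        rows.put("3", "payload for item 3");

        // The "where" clause, expressed as XPath over the metadata.
        String where = "/items/item[@region='us' and @status='active']/@id";
        for (String key : findKeys(index, where)) {
            System.out.println(key + " -> " + rows.get(key));
        }
    }
}
```

With the sample data above this prints "1 -> payload for item 1". Whether something like this beats a tuned database depends entirely on your query mix, so treat it as a starting point to benchmark rather than a recommendation.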

On Mon, Jun 14, 2010 at 9:14 PM, jgardner@xxxxxxxxxxxxxxxxxxx <jgardner@xxxxxxxxxxxxxxxxxxx> wrote:

We have a fairly unique need for a local, in-memory cache. This will store data aggregated from other sources. Generating the data only takes a few minutes, and it is updated often. There will be some fairly expensive queries of arbitrary complexity run at a fairly high rate. We're looking for high concurrency and reasonable performance throughout.

The entire data set is roughly 20 MB in size. We've tried Carbonado in front of Sleepycat JE only to discover that it chokes at a fairly low concurrency and that Carbonado's rule-based optimizer is wholly insufficient for our needs. We've also tried Carbonado's Map Repository, which suffers the same problems.

I've since moved the backend database to a local PostgreSQL instance hoping to take advantage of PostgreSQL's superior performance at high concurrency. Of course, at the default settings, it performs quite poorly compared to the Map Repository and Sleepycat JE.

My question is how can I configure the database to run as quickly as possible if I don't care about data consistency or durability? That is, the data is updated so often and it can be reproduced fairly rapidly so that if there is a server crash or random particles from space mess up memory we'd just restart the machine and move on.

I've never configured PostgreSQL to work like this and I thought maybe someone here had some ideas on a good approach to this.

--

Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
