All,
So, I've been discussing this because using PostgreSQL on the caching
layer has become more common than I think most people realize. Jonathan
is one of 4 companies I know of who are doing this, and with the growth
of Hadoop and other large-scale data-processing technologies, I think
demand will increase.
That's especially true because, in repeated tests, PostgreSQL with
persistence turned off is just as fast as the fastest nondurable NoSQL
database, and it has a LOT more features.
Now, while fsync=off and tmpfs for WAL more-or-less eliminate the IO for
durability, they don't eliminate the CPU time, which means that a
caching version of PostgreSQL could be even faster. To do that, we'd
need to:
a) Eliminate WAL logging entirely
b) Eliminate checkpointing
c) Turn off the background writer
d) Have PostgreSQL refuse to restart after a crash and instead call an
external script (for reprovisioning)
Of the four above, (a) is the most difficult codewise. (b), (c) and (d)
should be relatively straightforward, although I believe that we now
have the bgwriter doing some other essential work besides syncing
buffers. There's also a narrower use-case in eliminating (a), since a
non-fsync'd server which was recording WAL could be used as part of a
replication chain.
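
For reference, the existing workaround mentioned above (fsync=off) can be
sketched as a small script that renders a postgresql.conf fragment. This
is only an illustration, not a proposed patch: fsync is the setting named
above, while synchronous_commit and full_page_writes are additional
existing settings I am assuming people would commonly pair with it. The
names NON_DURABLE_SETTINGS and render_conf are hypothetical.

```python
# Sketch: render a postgresql.conf fragment for a non-durable setup.
# Only "fsync" comes from the discussion above; the other two entries
# are assumptions about what is typically switched off alongside it.
NON_DURABLE_SETTINGS = {
    "fsync": "off",               # do not force WAL/data writes to disk
    "synchronous_commit": "off",  # assumption: do not wait for WAL flush at commit
    "full_page_writes": "off",    # assumption: skip full-page images in WAL
}


def render_conf(settings):
    """Return the settings as 'name = value' lines, one per line."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items()) + "\n"


if __name__ == "__main__":
    # Print the fragment so it can be appended to postgresql.conf by hand.
    print(render_conf(NON_DURABLE_SETTINGS), end="")
```

Note that none of these settings stops WAL records from being generated,
which is why the CPU cost remains and why (a) would need code changes.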
This isn't on hackers because I'm not ready to start working on a patch,
but I'd like some feedback on the complexities of doing (b) and (c) as
well as how many people could use a non-persistent, in-memory postgres.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com