From: pgsql-general-owner@
postgresql.org [mailto:pgsql-general-owner@postgresql.org ] On Behalf Of Pavel StehuleI guess my question then is: how much do you pay for that durability? If you benchmark Postgres configured for pure in-memory usage with absolutely no writes to disk (or SSD or network), where is it spending its time? Is there a lot of overhead in getting data in and out of cache buffers and conversions and in concurrency control?
It is not about durability only.
Postgres holds data in format equal or similar to saved data on persistent storage. There are repeated serialization and deserialization. Some structures are designed to be simply saved (like Btree), but the performance is second target.
I believe so new memory databases can be 10-100x faster - depends on use case, because they hold data primary in memory and uses different data structures. The performance of these databases is great, when all data are well placed in memory all time. But the performance is pretty bad, when this rule is not true. There is another issue - when you increase speed of database write operations, probably you will hit a file system limits, spin lock issues - so it is one reason, why big system are based on distributed systems more and more.
That’s the point I’m making, exactly. The question is: does anyone have a handle on how big that cost really is, as a guide to whether to try to do anything about it? Is it really 25x as Stonebraker says?
I did some benchmarks of MonetDB and it is really pretty fast for OLAP.
You can try to implement TPC-B benchmark in C (without SQL - the cost of SQL is not significant), and you can check it.