Hi, On 26 September 2012 21:50, Ryan Kelly <rpkelly22@xxxxxxxxx> wrote: > The size of our database is growing rather rapidly. We're concerned > about how well Postgres will scale for OLAP-style queries over terabytes > of data. Googling around doesn't yield great results for vanilla > Postgres in this application, but generally links to other software like > Greenplum, Netezza, and Aster Data (some of which are based off of > Postgres). Too, there are solutions like Stado. But I'm concerned about > the amount of effort to use such solutions and what we would have to > give up feature-wise. We were able to have 2TB of data (16 Cores, 96GB RAM, FC connected SAN). Performance was good: - up to 5sec for simple select with multiple where conditions (2 - 25) order by any column (of 400 columns) and return top 10k - up to 15sec for executing 5 queries (simultaneously) which return top 20 combination of any two columns - up to 25sec for executing 56 queries (using materialised aggregate tables) - aggregation job ran every 15 minutes and completed under 2 minutes: 5mil rows -> aggregation -> 56 tables - all queries can be executed over date range up to several months (monthly partitioned tables, 6 months history) but it was very hard to scale this solution. We have tried: - FusionIO cards: 10 to 100 times better performance, but very hard to expand storage capacity; Cooling/power issues - AsterData: nice SQL-MR feature and analytics (decision trees, frequent items, clustering, ...); No libpq support and you have to use JDBC or selected ODBC manager - Greenplum (winer): performance comparable to FusionIO (10 to 50 times); we were able to remove aggregation job (because of columnar store model); easy to port from postgres but could be complicated if you are heavy pgpsql user At this time I would try: - Postgres-XC - Stado - Cassandra + Solr + Hadoop (for example DataStax Enterprise) - Birst (http://www.birst.com/) PS. For any commercial solution is better to buy their appliance (if you can afford it...). -- Ondrej Ivanic (ondrej.ivanic@xxxxxxxxx) -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general