Search Postgresql Archives

Re: Index only scan

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 10 October 2012 23:03, Lars Helge Øverland <larshelge@xxxxxxxxx> wrote:
> We are now in the process of designing a new component for analytics
> and this feature got me thinking we could utilize postgres over other
> alternatives like column-oriented databases. Basically we will have a
> wide, denormalized table with 20+ columns with relatively low
> cardinality. Typically we will have queries which sums a fact column
> based on where/group by clauses on several dimension columns (standard
> data warehouse stuff). An example would be "select a, b, c, sum(d)
> from analytics where a=1 and b=2 group by a,b,c";
>
> Finally, is there anyone else who are using postgres for this purpose
> and have some good tips to share in order to achieve good performance,
> including index strategies, beyond the standard config best practices?

yes, we had fact table which has around 250 columns and 250mil rows.
The question is if you can partition your data set. For example,
monthly partition. This keeps indexes small but all queries must be
constrained by the same column as is used for partitioning (ie.
monthly partitions -> every query should have "datetime between ...
and ...")

>From my experience postgres is not good with large group by queries.
For example, your query:
select a, b, c, sum(d) from analytics where a=1 and b=2 group by a,b,c

could be executed over multiple connections:
insert into t select select a, b, c, sum(d) as d from analytics where
c >= val1 and c < val2 and a=1 and b=2 group by a,b,c
insert into t select select a, b, c, sum(d) as d from analytics where
c >= val2 and c < val3 and a=1 and b=2 group by a,b,c
...
insert into t select select a, b, c, sum(d) as d from analytics where
c >= valN-1 and c < valN and a=1 and b=2 group by a,b,c

and then get the final result:
select a, b, c, sum(d) from t group by a,b,c

You can use pgpool-II parallel query feature instead of manual slicing.

-- 
Ondrej Ivanic
(ondrej.ivanic@xxxxxxxxx)
(http://www.linkedin.com/in/ondrejivanic)


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux