Re: Postgresql - performance of using array in big database

Ondrej Ivanič <ondrej.ivanic@xxxxxxxxx> · Thu, 9 Aug 2012 10:18:10 +1000

Hi,

On 3 August 2012 19:14,  <roberthanco@xxxxx> wrote:
> I want to add to table "Item" a column "a_elements" (array type of big
> integers) Every record would have not more than 50-60 elements in this
> column.
> After that i would create index GIN on this column and typical select should
> look like this:
> select*from item where......and5<@ a_elements;

I would use this.

> I have also second, more classical, option.
> Do not add column a_elements to table item but create table elements with
> two columns:
>
> id_item
> id_element
>
> This table would have around 200 mln records.
> I am able to do partitioning on this tables so number of records would
> reduce to 20 mln in table elements and 500 K in table item.

I do not understand how you can 'reduce to 20 mln'. Do you mean per partition?

> The second option select looks like this:
> select item.*
> from item
>     leftjoin elements on(item.id_item=elements.id_item)
> where....
> and5= elements.id_element
> I wonder what option would be better in performance point of view. Is
> postgres able to use many different indexes with index GIN (option 1) in a
> single query ?

Assuming that you partition your tables using id_item. Postgres is not
good with partitions if joins are used. Let's have a query:
select .. from item
left join elements on (item.id_item=elements.id_item)
where id_item = 2

needs to scan all partitions in 'elements' table because planner is
not smart enough to push where condition to join clause i.e. rewrite
query like this (8.4, haven't checked in 9.x releases):

select .. from item
left join elements on (item.id_item=elements.id_item and elements.id_item = 2)
where id_item = 2

In order to use partitioning effectively all you queries need to have
constant expression (id_item = 2) in where/join on columns which are
used for partitioning

-- 
Ondrej Ivanic
(ondrej.ivanic@xxxxxxxxx)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance