On 09/19/2014 04:51 AM, Björn Wittich wrote:
> I am relatively new to Postgres. I have a table with 500 columns and
> about 40 million rows. I call this a cache table: one column is a
> unique key (indexed) and the other 499 columns (type integer) are
> values belonging to this key.
>
> Now I have a second (temporary) table (only 2 columns, one being the
> key of my cache table), and I want to do an inner join between my
> temporary table and the large cache table and export all matching
> rows. I found out that performance increases when I split the join
> into lots of small parts. But it seems that the database needs a lot
> of disk I/O to gather all 499 data columns.
>
> Is there a possibility to tell the database that all these columns
> are always treated as tuples and that I always want to get the whole
> row? Perhaps the disk organization could then be optimized?

PostgreSQL is already a row store, which means that by default you get
all of the columns, and the columns are stored physically adjacent to
each other. If requesting only one or two columns is faster than
requesting all of them, that's almost certainly due to transmission
time, not disk I/O.

Otherwise, please post your schema (well, a truncated version) and
your queries.

BTW, in cases like yours I've used an INT array instead of 500 columns
to good effect; it works slightly better with PostgreSQL's compression.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
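
A minimal sketch of the array layout suggested above, assuming
hypothetical table and column names (cache_array, temp_keys, vals) that
do not come from the original thread:

    -- One int[] column holds all 499 values for a key, so the row is
    -- stored (and TOAST-compressed) as a single wide datum rather than
    -- 499 separate columns.
    CREATE TABLE cache_array (
        key  text PRIMARY KEY,
        vals integer[] NOT NULL    -- array of 499 integers per key
    );

    -- A temporary table carrying only the keys to look up.
    CREATE TEMP TABLE temp_keys (
        key text PRIMARY KEY
    );

    -- The join returns whole rows, array included, in one pass.
    SELECT c.key, c.vals
    FROM   temp_keys t
    JOIN   cache_array c ON c.key = t.key;

Individual values are still reachable as vals[1] .. vals[499]
(PostgreSQL arrays are 1-based); the trade-off is that you give up
per-column statistics and the ability to index single values directly.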