Re: how to improve perf of 131MM row table?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I sent the details as identified by pgAdmin III.

psql output shows this:
\d alf_node
               Table "public.alf_node"
     Column     |          Type          | Modifiers
----------------+------------------------+-----------
 id             | bigint                 | not null
 version        | bigint                 | not null
 store_id       | bigint                 | not null
 uuid           | character varying(36)  | not null
 transaction_id | bigint                 | not null
 node_deleted   | boolean                | not null
 type_qname_id  | bigint                 | not null
 locale_id      | bigint                 | not null
 acl_id         | bigint                 |
 audit_creator  | character varying(255) |
 audit_created  | character varying(30)  |
 audit_modifier | character varying(255) |
 audit_modified | character varying(30)  |
 audit_accessed | character varying(30)  |
Indexes:
    "alf_node_pkey" PRIMARY KEY, btree (id) CLUSTER
    "store_id" UNIQUE, btree (store_id, uuid)
    "fk_alf_node_acl" btree (acl_id)
    "fk_alf_node_loc" btree (locale_id)
    "fk_alf_node_store" btree (store_id)
    "fk_alf_node_tqn" btree (type_qname_id)
    "fk_alf_node_txn" btree (transaction_id)
    "idx_alf_node_del" btree (node_deleted)
    "idx_alf_node_txn_del" btree (transaction_id, node_deleted)
Foreign-key constraints:
"fk_alf_node_acl" FOREIGN KEY (acl_id) REFERENCES alf_access_control_list(id)
    "fk_alf_node_loc" FOREIGN KEY (locale_id) REFERENCES alf_locale(id)
    "fk_alf_node_store" FOREIGN KEY (store_id) REFERENCES alf_store(id)
    "fk_alf_node_tqn" FOREIGN KEY (type_qname_id) REFERENCES alf_qname(id)
"fk_alf_node_txn" FOREIGN KEY (transaction_id) REFERENCES alf_transaction(id)
Referenced by:
TABLE "alf_child_assoc" CONSTRAINT "fk_alf_cass_cnode" FOREIGN KEY (child_node_id) REFERENCES alf_node(id) TABLE "alf_child_assoc" CONSTRAINT "fk_alf_cass_pnode" FOREIGN KEY (parent_node_id) REFERENCES alf_node(id) TABLE "alf_node_aspects" CONSTRAINT "fk_alf_nasp_n" FOREIGN KEY (node_id) REFERENCES alf_node(id) TABLE "alf_node_assoc" CONSTRAINT "fk_alf_nass_snode" FOREIGN KEY (source_node_id) REFERENCES alf_node(id) TABLE "alf_node_assoc" CONSTRAINT "fk_alf_nass_tnode" FOREIGN KEY (target_node_id) REFERENCES alf_node(id) TABLE "alf_node_properties" CONSTRAINT "fk_alf_nprop_n" FOREIGN KEY (node_id) REFERENCES alf_node(id) TABLE "alf_store" CONSTRAINT "fk_alf_store_root" FOREIGN KEY (root_node_id) REFERENCES alf_node(id) TABLE "alf_subscriptions" CONSTRAINT "fk_alf_sub_node" FOREIGN KEY (node_id) REFERENCES alf_node(id) ON DELETE CASCADE TABLE "alf_subscriptions" CONSTRAINT "fk_alf_sub_user" FOREIGN KEY (user_node_id) REFERENCES alf_node(id) ON DELETE CASCADE TABLE "alf_usage_delta" CONSTRAINT "fk_alf_usaged_n" FOREIGN KEY (node_id) REFERENCES alf_node(id)

This line of the output:
    "alf_node_pkey" PRIMARY KEY, btree (id) CLUSTER
would indicate to me that there is a PK on alf_node table, it is on column "id", it is of type btree, and the table is clustered around that index.

Am I reading this totally wrong?

The supporting table actually seems to have a multi-column PK defined, and a separate btree index on node_id as you mentioned.

-AJ


On 6/26/2014 10:05 AM, Shaun Thomas wrote:
On 06/26/2014 08:26 AM, AJ Weber wrote:

The "master table" definition is attached as "table1.sql".
The "detail table" definition is attached as "table2.sql".

I'm not sure what you think a primary key is, but neither of these tables have one. Primary keys are declared one of two ways:

CREATE TABLE foo
(
  id    BIGINT PRIMARY KEY,
  col1  VARCHAR,
  col2  INT
);

Or this:

CREATE TABLE foo
(
  id    BIGINT,
  col1  VARCHAR,
  col2  INT
);

ALTER TABLE foo ADD constraint pk_foo PRIMARY KEY (id);

On your alf_node_properties table, you only have an index on node_id because you created one. If you look at your alf_node table, there is no index on the id column at all. This is confirmed by the explain output you attached:

Seq Scan on alf_node node (cost=0.00..227265.29 rows=5733429 width=16) (actual time=0.013..2029.649 rows=5733888 loops=1)

Since it has no index, the database is reading the entire table to find your matching values. Then it's using the index on node_id in the other table to find the 'detail' matches, as seen here:

Bitmap Index Scan on fk_alf_nprop_n (cost=0.00..1240.00 rows=52790 width=0) (actual time=0.552..0.552 rows=1071 loops=1)

Add an actual primary key to your alf_node table, and your query performance should improve substantially. But I also strongly suggest you spend some time learning how to read an EXPLAIN plan, as that would have made your problem obvious immediately.

Here's a link for your version:

http://www.postgresql.org/docs/9.0/static/sql-explain.html

You should still consider upgrading to the latest release of 9.0 too.





[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux