Re: Query using SeqScan instead of IndexScan

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jim,

I'm not quite sure what you mean by the correlation of category_id?
The category_id is part of a compound primary key in the category_product
table. The primary key on category_product is (category_id, product_id).

Here's the definitions of the two tables involved in the join:

            Table "public.category_product"
       Column        |         Type         | Modifiers
---------------------+----------------------+-----------
category_id         | integer              | not null
product_id          | integer              | not null
en_name_sort_order  | integer              |
fr_name_sort_order  | integer              |
merchant_sort_order | integer              |
price_sort_order    | integer              |
merchant_count      | integer              |
is_active           | character varying(5) |
Indexes:
"x_category_product_pk" PRIMARY KEY, btree (category_id, product_id)
    "category_product__is_active_idx" btree (is_active)
"category_product__merchant_sort_order_idx" btree (merchant_sort_order) "x_category_product__category_id_fk_idx" btree (category_id) CLUSTER
    "x_category_product__product_id_fk_idx" btree (product_id)
Foreign-key constraints:
"x_category_product_category_fk" FOREIGN KEY (category_id) REFERENCES category(category_id) DEFERRABLE INITIALLY DEFERRED "x_category_product_product_fk" FOREIGN KEY (product_id) REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED



             Table "public.product_attribute_value"
           Column           |         Type          | Modifiers
----------------------------+-----------------------+-----------
attribute_id               | integer               | not null
attribute_unit_id          | integer               |
attribute_value_id         | integer               |
boolean_value              | character varying(5)  |
decimal_value              | numeric(30,10)        |
product_attribute_value_id | integer               | not null
product_id                 | integer               | not null
product_reference_id       | integer               |
status_code                | character varying(32) |
Indexes:
"product_attribute_value_pk" PRIMARY KEY, btree (product_attribute_value_id)
    "product_attribute_value__attribute_id_fk_idx" btree (attribute_id)
"product_attribute_value__attribute_unit_id_fk_idx" btree (attribute_unit_id) "product_attribute_value__attribute_value_id_fk_idx" btree (attribute_value_id)
    "product_attribute_value__product_id_fk_idx" btree (product_id)
"product_attribute_value__product_reference_id_fk_idx" btree (product_reference_id)
Foreign-key constraints:
"product_attribute_value_attribute_fk" FOREIGN KEY (attribute_id) REFERENCES attribute(attribute_id) DEFERRABLE INITIALLY DEFERRED "product_attribute_value_attributeunit_fk" FOREIGN KEY (attribute_unit_id) REFERENCES attribute_unit(attribute_unit_id) DEFERRABLE INITIALLY DEFERRED "product_attribute_value_attributevalue_fk" FOREIGN KEY (attribute_value_id) REFERENCES attribute_value(attribute_value_id) DEFERRABLE INITIALLY DEFERRED "product_attribute_value_product_fk" FOREIGN KEY (product_id) REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED "product_attribute_value_productreference_fk" FOREIGN KEY (product_reference_id) REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED


Not sure if that helps answer your question, but the query is pretty slow. Sometimes it takes 5 - 15 seconds depending on the category_id specified.

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 |  brendan@xxxxxxxxxxxxxx

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Mar 31, 2006, at 8:59 AM, Jim C. Nasby wrote:

What's the correlation of category_id? The current index scan cost
estimator places a heavy penalty on anything with a correlation much
below about 90%.

On Wed, Mar 29, 2006 at 08:12:28PM -0700, Brendan Duddridge wrote:
Hi,

I have a query that is using a sequential scan instead of an index
scan. I've turned off sequential scans and it is in fact faster with
the index scan.

Here's my before and after.

Before:

ssdev=# SET enable_seqscan TO DEFAULT;
ssdev=# explain analyze select cp.product_id
		from category_product cp, product_attribute_value pav
		where cp.category_id = 1001082 and cp.product_id =
		pav.product_id;


           QUERY PLAN
--------------------------------------------------------------------- --- --------------------------------------------------------------------- ---
------------------------------
Hash Join  (cost=25.52..52140.59 rows=5139 width=4) (actual
time=4.521..2580.520 rows=19695 loops=1)
   Hash Cond: ("outer".product_id = "inner".product_id)
   ->  Seq Scan on product_attribute_value pav  (cost=0.00..40127.12
rows=2387312 width=4) (actual time=0.039..1469.295 rows=2385846 loops=1)
   ->  Hash  (cost=23.10..23.10 rows=970 width=4) (actual
time=2.267..2.267 rows=1140 loops=1)
         ->  Index Scan using x_category_product__category_id_fk_idx
on category_product cp  (cost=0.00..23.10 rows=970 width=4) (actual
time=0.122..1.395 rows=1140 loops=1)
               Index Cond: (category_id = 1001082)
Total runtime: 2584.221 ms
(7 rows)


After:

ssdev=# SET enable_seqscan TO false;
ssdev=# explain analyze select cp.product_id
		from category_product cp, product_attribute_value pav
		where cp.category_id = 1001082 and cp.product_id =
		pav.product_id;


              QUERY PLAN
--------------------------------------------------------------------- --- --------------------------------------------------------------------- ---
-------------------------------------
Nested Loop  (cost=0.00..157425.22 rows=5139 width=4) (actual
time=0.373..71.177 rows=19695 loops=1)
   ->  Index Scan using x_category_product__category_id_fk_idx on
category_product cp  (cost=0.00..23.10 rows=970 width=4) (actual
time=0.129..1.438 rows=1140 loops=1)
         Index Cond: (category_id = 1001082)
   ->  Index Scan using product_attribute_value__product_id_fk_idx
on product_attribute_value pav  (cost=0.00..161.51 rows=61 width=4)
(actual time=0.016..0.053 rows=17 loops=1140)
         Index Cond: ("outer".product_id = pav.product_id)
Total runtime: 74.747 ms
(6 rows)

There's quite a big difference in speed there. 2584.221 ms vs. 74.747
ms.

Any ideas what I can do to improve this without turning sequential
scanning off?

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 |  brendan@xxxxxxxxxxxxxx

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com




--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@xxxxxxxxxxxxx
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
       message can get through to the mailing list cleanly


Attachment: smime.p7s
Description: S/MIME cryptographic signature


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux