Search Postgresql Archives

Re: Optimising SELECT on a table with one million rows

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Cultural Sublimation wrote:
SELECT comments.comment_id, users.user_name
FROM comments, users
WHERE comments.comment_story = 100 AND comments.comment_author = users.user_id;

The problem is that this query takes a *very* long time.  With the said
1,000,000 comments, it needs at least 1100ms on my system.  "Explain
analyze" tells me that a sequential scan is being performed on both
users and comments:

Hash Join  (cost=28.50..21889.09 rows=988 width=14) (actual
time=3.674..1144.779 rows=1000 loops=1)
   Hash Cond: ((comments.comment_author)::integer = (users.user_id)::integer)
   ->  Seq Scan on comments  (cost=0.00..21847.00 rows=988 width=8) (actual
time=0.185..1136.067 rows=1000 loops=1)
         Filter: ((comment_story)::integer = 100)
   ->  Hash  (cost=16.00..16.00 rows=1000 width=14) (actual time=3.425..3.425
rows=1000 loops=1)
         ->  Seq Scan on users  (cost=0.00..16.00 rows=1000 width=14) (actual
time=0.068..1.845 rows=1000 loops=1)
 Total runtime: 1146.424 ms

If you have no index on comments.comment_author, then a seqscan will be required for your join between comments and users. Similarly, if you have no index on comments.comment_story, then any query against comments that uses that column as part of a predicate will require a seqscan of the comments table.

Note that an FK constraint does not automatically create an index on the underlying column. You need to create the actual index yourself if it will be necessary for your queries.

On the long run, I guess one possible solution to this problem will be
to partition the comments table into a number of sub-tables, most likely
based on the timestamp attribute (by having current versus historic data).

Partitioning on comments.comment_timestamp won't help you at all for this particular query, since you don't have a condition in your query dependent upon that value. It might help you for other queries (such as gathering up all the comments posted on a particular day, or during some other time range), but it won't make any positive difference for this query.

-Jon

--
Senior Systems Developer
Media Matters for America
http://mediamatters.org/

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux