On Tue, Mar 22, 2011 at 11:28 PM, Adarsh Sharma <adarsh.sharma@xxxxxxxxxx> wrote:
please describe what your query is trying to select, as it is possible that query isn't doing what you think it is. joining 2 tables where id1 != id2 will create a cross multiple of the two tables such that every row from the first table is matched with every single row from the second table that doesn't have a matching id. Then you are looking for distinct values on that potentially enormous set of rows.
I perform a join query on it as :
explain analyze select distinct(p.crawled_page_id) from page_content p , clause2 c where p.crawled_page_id != c.source_id ;
What it takes more than 1 hour to complete. As I issue the explain analyze command and cannot able to wait for output but I send my explain output as :
db_v2=# select * from table1;
id | value
----+-------
1 | 1
2 | 2
3 | 3
(3 rows)
db_v2=# select * from table2;
id | value
----+-------
1 | 4
2 | 5
3 | 6
(3 rows)
id | value | id | value
----+-------+----+-------
1 | 1 | 2 | 5
1 | 1 | 3 | 6
2 | 2 | 1 | 4
2 | 2 | 3 | 6
3 | 3 | 1 | 4
3 | 3 | 2 | 5
So if you have a couple of million rows in each table, you are selecting distinct over a potentially huge set of data. If you are actually trying to find all ids from one table which have no match at all in the other table, then you need an entirely different query: