Hello, We are using Postgres 9.1.4. We are struggling with a class of queries that got impossible to run after sharding a large table. Everything like: select small.something, big.anything from small join big on small.big_id = big.id; and variation such as "select * from big where id in (select big_id from small)" Since "big" was sharded, the query plan results in something like: Hash Join (cost=10000000001.23..30038997974.72 rows=10 width=753) Hash Cond: (b.id = i.big_id) -> Append (cost=0.00..20038552251.23 rows=118859245 width=11) -> Index Scan using big_201207_pkey on big_201207 b (cost=0.00..2224100.46 rows=1609634 width=12) -> Index Scan using big_201101_pkey on big_201101 b (cost=0.00..404899.71 rows=5437497 width=12) -> Index Scan using big_201104_pkey on big_201104 b (cost=0.00..349657.58 rows=4625181 width=12) -> [...all the shards] -> Hash (cost=10000000001.10..10000000001.10 rows=10 width=742) -> Seq Scan on small i (cost=10000000000.00..10000000001.10 rows=10 width=742) Postgres ends up in never-ending reads: even if "small" has only three rows I've never seen such query finishing, the time passed being even longer than a full scan on big. The plan looks sub-optimal, as it seems it first does a huge indexscan of all the partitions, then it joins the result against a small hash. 1. Can we fix the queries to work around this problem? 2. Could the planner be fixed for this scenario for PG 9.2 (or 9.3)? Creating the hash beforehand, performing an hash join for each partition and merging the results looks like it would bring it back into the realm of the runnable queries. Am I wrong? Thank you very much. -- Daniele -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance