Hi, I have a problem with multiple solutions, and my question concerns which solution is preferred given whatever optimizations happen inside the query planner. In a bug tracking system, I have the tables "users", "bugs" and "tags". One bug may have one user (the reporter) but multiple tags. Moreover, one tag may be associated with multiple bugs. There is therefore a many-to-many relation between bugs and tags. I use an extra table "bug_tags" to keep track of this relation, as shown below: CREATE TABLE bugs ( bid int4 UNIQUE NOT NULL, uid int4 REFERENCES users(uid) NOT NULL, title text NOT NULL, PRIMARY KEY (bid)); CREATE TABLE bug_tags ( bid int4 REFERENCES bugs(bid) NOT NULL, tid int4 REFERENCES tags(tid) NOT NULL, PRIMARY KEY (bid, tid)); Here's the problem: I want to retrieve a list of bugs (possibly) matching certain constraints. One possible constraint is a user ID: if given, only those bugs reported by the user will be returned. Another constraint is a set of tags: only those bugs that contain *all* the tags in the set must be returned (if the set is empty, then all bugs match). Moreover, the user and tags constraints are conjunctive: if both set, that means I want only the bugs by that user having also the given set of tags. Below is one possible solution using a CTE to fetch all bugs that match the tags constraint. (Note that $IDENTIFIER will be replaced by the actual contents of the variable using prepared statements). with tagged as (select bid from bug_tags inner join tags on bug_tags.tid = tags.tid group by bid having array_agg (tags.tag) @> $TAGS) select * from bugs where (($UID :: int4) is null or uid = $UID) and ($TAGS = (array[] :: text[]) or bid in (select bid from tagged)) The first alternative is shown below. The basic principle is the same, except that instead of a CTE there is a subquery. The idea is that the subquery will only be performed if the set of tags is non-empty. Or is the query planner smart enough to never execute the CTE if this is the case? If so, then both these solutions should be identical. select * from bugs where (($UID :: int4) is null or uid = $UID) and ($TAGS = (array[] :: text[]) or bid in ( select bid from bug_tags inner join tags on bug_tags.tid = tags.tid group by bid having array_agg (tags.tag) @> $TAGS)) The second alternative is shown below. This one uses a different approach: instead of constructing a set of bug IDs matching the tag constraints, it goes one by one through the bugs, verifying if each one matches the tag constraints. select * from bugs where (($UID :: int4) is null or uid = $UID) and ($TAGS = (array[] :: text[]) or $TAGSs <@ array (select tags.tag from bug_tags inner join tags on bug_tags.tid = tags.tid where bug_tags.bid = bugs.bid)) To conclude: is it worth worrying about such details, or can I assume the query optimizer will automagically choose the best approach. And if not, from which solution should I expect better results when a) there are no constraints, b) there are only user constraints, c) there are only tag constraints, and d) there are both user and tag constraints? Thank you in advance for any light you might be able to shed! Best, Jon P.S. Other solutions to a similar problem can be found on this thread: http://stackoverflow.com/questions/14527080/selecting-matching-subset-in-many-to-many-relation -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general