Peter Geoghegan <pg@xxxxxxx> writes: > I wonder what it would take to teach the optimizer to consider the > possibility of a "collation strength reduction". In other words, for > aggregates that perform a sort (or for aggregates that rely on the > presence of a sort node without there being some other dependency on > the sort node), it should be possible for the optimizer to determine > that it would be just fine to use the C locale, since the user isn't > entitled to assume anything about the exact sort order. There are of > course cases where this can make a huge difference. IMO the way to handle this would be to consider both paths that use the straight sort order and paths that use COLLATE "C" ordering. I think the key structural limitation that makes it not straightforward is that the query_planner() API supports only one target ordering (root->query_pathkeys). I've had a bee in my bonnet for awhile about replacing that with a list of potentially-useful target orderings, but haven't got round to making it happen. Of course, we would also have to teach cost_sort or someplace near there that non-C sorting is much more expensive than C-collation sorting. Not sure about exactly how to set that up without it being a kluge. A related problem is that if you have "GROUP BY x,y" and no particular ORDER BY requirement, you could sort by either x,y or y,x before the GroupAgg. This would matter if, say, there was an index matching one but not the other. Right now we're very stupid and only consider x,y, but if there were room to consider more than one set of target pathkeys it would be fairly simple to make that better. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general