Hello, this is my first post to this particular list (I've also posted to the performance list a while ago to report a bug).

I generate complex dynamic queries and paginate them using "LIMIT 50". I also need to compute some statistics over the results, and I'd like to know whether I can avoid issuing extra queries, or at least do this without much of a performance hit. The statistic that bothers me most is a distinct count. For example, suppose a query matches rows with more than 50 distinct values in some column, while only the first 50 rows are returned by the paginated query. When the user asks for statistics, the application should show the counts per distinct value for certain columns, including values that don't appear on the current page of results.

I noticed there is a plan for PG 9.2 to include a query_to_json function, which would indeed help me get this done when used together with a window function. Our application is currently running PG 9.1 (it was rolled back from 9.2 due to a performance bug that seems to be fixed now but not yet released).

Is there any way I could avoid running another query to get these statistics? The queries can become really complex and take almost 5s to complete in some cases, which means that running two queries with the same conditions would take about 10s. I could run them at different times, but then the user would have to wait 5s just for the statistics calculation, which is what I'd like to avoid.

This is a clustered web application, so I'd rather avoid any client library support for paginating results with cursors or anything like that, and stick with OFFSET and LIMIT.
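To illustrate the kind of single query I'm imagining on 9.1, here is a rough sketch using a hypothetical "items" table with "id", "created_at" and "category" columns standing in for my real, much more complex dynamic query (the table, columns and filter are just placeholders):

    WITH base AS (
        -- placeholder for the real dynamic query and its WHERE conditions
        SELECT * FROM items WHERE created_at >= DATE '2012-01-01'
    ),
    stats AS (
        -- per-distinct-value counts over the whole filtered set
        SELECT category, count(*) AS cnt
        FROM   base
        GROUP  BY category
    )
    SELECT b.*,
           count(*) OVER () AS total_rows,  -- total matching rows, for the pager
           (SELECT array_agg(category || ':' || cnt::text)
            FROM   stats)   AS category_counts  -- all distinct values, even those not on this page
    FROM   base b
    ORDER  BY b.id
    LIMIT  50 OFFSET 0;

Since WITH queries are materialized in 9.1, "base" should be evaluated only once and shared by the statistics and by the page of rows, at the cost of repeating the same category_counts array on every one of the 50 returned rows. On 9.2 I suppose the same idea could return the stats as JSON instead of a text array.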
Any help is appreciated!

Thanks,
Rodrigo.