On Wed, Feb 13, 2013 at 10:42 AM, Carlos Henrique Reimer <carlos.reimer@xxxxxxxxxxxxx> wrote: > Hi, > > I`m trying to figure out why a query runs in 755ms in the morning and > 20054ms (26x) in the evening. > > > Morning: > > Sort (cost=151845.90..152304.21 rows=183322 width=62) (actual > time=706.676..728.080 rows=32828 loops=1) ( index scan plan ) > > Evening: > > Sort (cost=321670.51..322111.45 rows=176377 width=62) (actual > time=20010.616..20031.887 rows=32840 loops=1) (sequential scan) So it believes the evening plan (seq scan) to be twice as slow as the morning plan (index scan), but uses it anyway. So it must think the morning plan would be even slower than that yet if it were run in the evening. Do you change your database's cost settings between morning and evening (for example, in preparation for night-time batch processing)? It would interesting see what it thinks of the index plan at the time which it is choosing the sequential scan plan. In the evening, can you rerun the query to get the sequential plan, then "set enable_seqscan=off" and run it again to get the index plan under the same conditions? > > We initially suspected the reason could be that in the morning all data is > in memory and in the evening not all is in memory but as database size is > 40GB and memory 64GB I would eliminate this hypothesis . I wouldn't eliminate it for that reason. Just because you have 64GB doesn't mean the kernel is willing to use all of it for file caching, we've seen complaints that some kernels under some settings are not adept at using that much memory. The reason to eliminate that theory is that you get two different execution plans. PostgreSQL must think something is different in order to change plan, and empirical degree of cachedness is not something PostgreSQL is cognizant of, so it must be something else that is different. > Another reason we > rejected this hypothesis is that even if you run the query two times, both > took almost the same time. That doesn't mean much for the sequential scan plans. PostgreSQL has a "ring buffer" mechanism that prevents sequential scans from shoving everything else out of the cache. If the OS has some mechanism with a similar goal in mind for the part of the cache it controls, then sequential plans might not become well cached even after several repeated executions. > Another possibility is a CPU bottleneck but as there is no indication of > this condition in the performance data collected by sar, top, vmstat we > assume the problem has another origin. Could you show us some of that info? sar averages over ten minutes, so a 20 second query probably isn't going to show up in it, unless you repeat it endlessly. > How could we determine why this difference in the response time? You could run "explain (analyze,buffers)", and also turn on track_io_timing, Cheers, Jeff -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general