On Thu, Oct 11, 2012 at 11:17 AM, Josh Berkus <josh@xxxxxxxxxxxx> wrote: > >> Does anyone see effective_cache_size make a difference anyway? If so, >> in what circumstances? > > E_C_S, together with random_page_cost, the table and index sizes, the > row estimates and the cpu_* costs, form an equation which estimates the > cost of doing various kinds of scans, particularly index scan vs. table > scan. E_C_S only comes into play when the same table pages are (predicted to be) visited repeatedly during the index scan, but this is the same situation in which a bitmap scan is generally preferred anyway. In fact the two seem to be conceptually very similar (either avoid actually visiting the block repeatedly, or avoid the IO cost of visiting the block repeatedly), and I'm not sure why bitmap scans comes out on top--there doesn't seem to be a CPU cost estimate of visiting a block which is assumed to already be in memory, nor is bitmap scan given credit for the use of effective_io_concurrency. But I found a simple case (over in "Unused index influencing sequential scan plan") which is very sensitive to E_C_S. When the index scan is being done to avoid a costly sort or aggregation, then it can't be usefully replaced with a bitmap scan since it won't produce index-order sorted output. >> In my hands, queries for which effective_cache_size might come into >> play (for deciding between seq scan and index scan) are instead >> planned as bitmap scans. > > You have a very unusual workload, or a very small database. I think all real workloads are unusual, otherwise benchmarking would be easy...but since complex queries are intractable to figure out what the planner is thinking, I'm biased to using simple ones when trying to figure out general principles. I can make the database look as big or small as I want (relative to RAM), by feeding effective_cache_size false information. Anyway, it seems like the consequences of overestimating E_C_S (by underestimating the number of processes that might expect to benefit from it concurrently) are worse than the consequences of underestimating it--assuming you have the types of queries for which it makes much of a difference. Cheers, Jeff -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance