2011/5/13 Josh Berkus <josh@xxxxxxxxxxxx>:
>
>> I guess maybe the reason why it didn't matter for the OP is that - if
>> the size of the index in pages is smaller than the pro-rated
>> fraction of effective_cache_size allowed to the index - then the
>> exact value doesn't affect the answer.
>>
>> I apparently need to study this code more.
>
> FWIW: random_page_cost is meant to be the ratio between the cost of
> looking up a single row as an index lookup, and the cost of looking up
> that same row as part of a larger sequential scan. For specific
> storage, that coefficient should be roughly the same regardless of the
> table size. So if your plan for optimization involves manipulating RPC
> for anything other than a change of storage, you're Doing It Wrong.
>
> Instead, we should be fixing the formulas these are based on and
> leaving RPC alone.
>
> For any data page, there are actually four costs associated with each
> tuple lookup, per:
>
>  in-memory/seq    | on disk/seq
> ------------------+------------------
>  in-memory/random | on disk/random

It also lacks some more theoretical costs, like sort_page/temp_page:
those are based on a ratio of seq_page_cost and random_page_cost, or on
seq_page_cost alone (when working beyond work_mem). Memory access is
accounted for with a factor of 0.1 in some places, AFAIR. (And random
vs. sequential memory access costs the same at the level of estimation
we do.)

>
> (yes, there's actually more for bitmapscan etc., but the example holds)

(If I read the sources correctly, for this one there is a linear
interpolation that weights the cost between random_page_cost and
seq_page_cost for the heap page fetches, plus the Mackert and Lohman
formula, if needed, used at its best: predicting what should be in
cache *because* of the current query execution, not because of the
current state of the page cache. I have put a rough sketch of both in
a PS below.)

>
> For any given tuple lookup, then, you can assign a cost based on where
> you think that tuple falls in that quadrant map. Since this is all
> probability-based, you'd be assigning a cost as a mixed % of in-memory
> and on-disk costs. Improvements in accuracy of this formula would come
> through improvements in accuracy in predicting if a particular data
> page will be in memory.
>
> This is what the combination of random_page_cost and
> effective_cache_size ought to supply, but I don't think it does,
> quite.
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
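
PS: for the archives, here is how I read the two formulas mentioned
above, as they appear in src/backend/optimizer/path/costsize.c, written
out as a small standalone C sketch. This is an illustration of the math
only, not the actual planner code, and the variable and function names
are mine:

    #include <math.h>

    /*
     * Sketch of the Mackert & Lohman estimate used by
     * index_pages_fetched(): predict how many distinct heap pages a
     * scan fetching tuples_fetched random tuples will actually read,
     * for a table of T pages given a cache of b pages (derived from
     * effective_cache_size).
     */
    static double
    pages_fetched_estimate(double tuples_fetched, double T, double b)
    {
        double  pages_fetched;

        if (T <= b)
        {
            /* table fits in cache: each page is fetched at most once */
            pages_fetched = (2.0 * T * tuples_fetched) /
                            (2.0 * T + tuples_fetched);
            if (pages_fetched > T)
                pages_fetched = T;
        }
        else
        {
            /* lim = point beyond which re-fetches start to miss the cache */
            double  lim = (2.0 * T * b) / (2.0 * T - b);

            if (tuples_fetched <= lim)
                pages_fetched = (2.0 * T * tuples_fetched) /
                                (2.0 * T + tuples_fetched);
            else
                pages_fetched = b + (tuples_fetched - lim) * (T - b) / T;
        }
        return ceil(pages_fetched);
    }

    /*
     * Sketch of the linear interpolation in cost_bitmap_heap_scan():
     * the larger the fraction of the table we fetch, the closer each
     * page fetch gets to sequential cost.
     */
    static double
    bitmap_heap_page_cost(double pages_fetched, double T,
                          double seq_page_cost, double random_page_cost)
    {
        if (pages_fetched >= 2.0)
            return random_page_cost -
                   (random_page_cost - seq_page_cost) *
                   sqrt(pages_fetched / T);
        return random_page_cost;
    }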
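
PPS: and to make the quadrant map concrete, a toy blend. This one is
entirely mine, nothing like it exists in the planner today: if we could
estimate the probability p that a page is already resident, the
per-page cost would be a weighted mix of the in-memory and on-disk
rows of the quadrant. mem_page_cost is a hypothetical in-memory page
cost; PostgreSQL has no such GUC.

    #include <stdbool.h>

    /*
     * Toy illustration of the quadrant map, not planner code: blend
     * in-memory and on-disk page costs by the estimated probability
     * that the page is already cached.
     */
    static double
    blended_page_cost(double p_in_cache,    /* estimated P(page resident) */
                      bool random_access,   /* random or sequential fetch */
                      double seq_page_cost,
                      double random_page_cost,
                      double mem_page_cost) /* hypothetical in-memory cost */
    {
        double  disk_cost = random_access ? random_page_cost
                                          : seq_page_cost;

        return p_in_cache * mem_page_cost +
               (1.0 - p_in_cache) * disk_cost;
    }

The hard part, of course, is where p_in_cache would come from;
effective_cache_size only gives a crude, global hint, which is exactly
the point being made above.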