On Mon, 2022-05-30 at 17:12 -0400, Jeff Janes wrote:
> On Mon, May 30, 2022 at 3:40 PM jian xu <jamesxu@xxxxxxxxxxx> wrote:
> > Thanks Laurenz. It is fine that these values are on an "arbitrary scale", but we still need
> > to calibrate them to get the correct "arbitrary scale". For example, if seq_page_cost
> > is 1 and cpu_tuple_cost is 0.01, how do we verify that a sequential page read really is
> > 100 times more expensive than processing one tuple?
> > "enable_seqscan" and "enable_nestloop" can help in some cases, but they don't work in all cases.
> > Does anyone have experience calibrating the cost model parameters to get correct values? Thanks
>
> I put a lot of time into it, and I think you will find that there is no consistent way to do so.
> For cpu_tuple_cost, for example, are the hint bits already set?  Do they need to get set?
> Are the clog pages still in memory?  Is there contention on the lock used to determine whether a given
> transaction is still running?  How many columns does the table have?  How many of them need to be
> accessed for the current case, how far left and right are they in the table, and are they TOASTed?
>
> For sequential reads, how much contention is there for the buffer mapping partition lock?  What is the
> throughput of your I/O system?  How many other sequential reads will be occurring at the same time?  Etc.
>
> For any change you intend to make, do you have a good enough load generator and test system set
> up so you can verify that it doesn't make something else worse?

+1

I think that the values of those parameters are set more by experience and tradition
than by measurable physical evidence.  In a way, it just happens to work (mostly).

For example, who says that on a spinning disk, random I/O is four times as slow as
sequential I/O?  Very likely, you have to factor in that part of the I/O requests are
satisfied from the kernel page cache.

Yours,
Laurenz Albe
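
A minimal sketch of the arithmetic being discussed, using a hypothetical table "t"
(not from the thread): for a plain sequential scan with no filter, the planner's total
cost is relpages * seq_page_cost + reltuples * cpu_tuple_cost, while the actual runtime
depends on caching, hint bits, lock contention and the other factors mentioned above.

    -- hypothetical test table, assumed only for illustration
    CREATE TABLE t AS
        SELECT g AS id, md5(g::text) AS payload
        FROM generate_series(1, 100000) g;
    ANALYZE t;

    SHOW seq_page_cost;     -- default 1.0
    SHOW cpu_tuple_cost;    -- default 0.01
    SHOW random_page_cost;  -- default 4.0

    -- planner statistics used in the cost formula
    SELECT relpages, reltuples FROM pg_class WHERE relname = 't';

    -- the total cost reported here should equal
    --   relpages * seq_page_cost + reltuples * cpu_tuple_cost
    EXPLAIN SELECT * FROM t;

    -- compare against measured time and buffer hits/reads, which vary
    -- with cache state and so resist consistent calibration
    EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM t;

Comparing the estimated cost with the measured time on a cold versus a warm cache is
one way to see why a single "correct" ratio between the parameters is hard to pin down.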