Re: anti-join chosen even when slower than old plan

Robert Haas <robertmhaas@xxxxxxxxx> writes:
> Yeah.  For Kevin's case, it seems like we want the caching percentage
> to vary not so much based on which table we're hitting at the moment
> but on how much of it we're actually reading.

Well, we could certainly take the expected number of pages to read and
compare that to effective_cache_size.  The thing that's missing in that
equation is how much other stuff is competing for cache space.  I've
tried to avoid having the planner need to know the total size of the
database cluster, but it's kind of hard to avoid that if you want to
model this honestly.
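
To make that concrete, here is a minimal sketch of costing a scan by
comparing the expected read size against effective_cache_size.  All
the names here are illustrative, not the actual costsize.c API, and
per the caveat above it ignores everything else competing for cache:

/*
 * Illustrative sketch only, not the real planner code.  Assume the
 * cache-sized prefix of the read is cached and the rest hits disk;
 * competition from other relations is the hole noted above.
 */
#include <stdio.h>

static double
scan_page_cost(double pages_to_read,
               double effective_cache_size,  /* in pages */
               double cached_page_cost,      /* cost of a cached page */
               double disk_page_cost)        /* e.g. seq_page_cost */
{
    double cached_fraction;

    if (pages_to_read <= 0.0)
        return 0.0;
    cached_fraction = effective_cache_size / pages_to_read;
    if (cached_fraction > 1.0)
        cached_fraction = 1.0;
    return pages_to_read * (cached_fraction * cached_page_cost +
                            (1.0 - cached_fraction) * disk_page_cost);
}

int
main(void)
{
    /* A 10GB table (1.25M 8K pages) under effective_cache_size = 4GB
     * (500K pages): 40% of the read is assumed cached. */
    printf("%.0f\n", scan_page_cost(1250000.0, 500000.0, 0.01, 1.0));
    return 0;
}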

Would it be at all workable to have an estimate that so many megs of a
table are in cache (independently of any other table), and then we could
scale the cost based on the expected number of pages to read versus that
number?  The trick here is that DBAs really aren't going to want to set
such a per-table number (at least, most of the time) so we need a
formula to get to a default estimate for that number based on some simple
system-wide parameters.  I'm not sure if that's any easier.

BTW, it seems that all these variants have an implicit assumption that
if you're reading a small part of the table it's probably part of the
working set; which is an assumption that could be 100% wrong.  I don't
see a way around it without trying to characterize the data access at
an unworkably fine level, though.

			regards, tom lane
