Greg Smith wrote:
> Matthew Wakeling wrote:
>> This sort of thing has been fairly well researched at an academic
>> level, but has not been implemented in that many real world
>> situations. I would encourage its use in Postgres.
>
> I guess, but don't forget that work on PostgreSQL is driven by what
> problems people are actually running into. There's a long list of
> performance improvements sitting in the TODO list waiting for people
> to find time to work on them, ones that we're quite certain are
> useful. That anyone is going to chase after any of these speculative
> ideas from academic research instead of one of those is unlikely.
>
> Your characterization of the potential speed up here is "Using a
> proper tree inside the index page would improve the CPU usage of the
> index lookups", which seems quite reasonable. Regardless, when I
> consider "is that something I have any reason to suspect is a
> bottleneck on common workloads?", I don't think of any, and return to
> working on one of the things I already know is instead.

There are two different things concerning gist indexes:
1) Larger block sizes, and hence a larger # of entries per gist page,
result in more generic keys for those pages. This in turn results in a
greater number of hits when the index is queried, so a larger part of
the index is scanned. NB this has nothing to do with caching / cache
sizes; it holds for every IO model. Tests performed by me showed
performance improvements of over 200%. Since then, implementing a speedup
has been on my 'want to do' list.
2) There are several approaches to get the # of entries per page down. Two
have been suggested in the thread referred to by Matthew: virtual pages
(but how to order these?) and a tree within a page. It would be
interesting to see whether ideas from Prokop's cache-oblivious algorithms
match this problem to find a suitable virtual page format.
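
To make point 1 concrete, here is a toy sketch in Python. It is not
PostgreSQL code and makes several assumptions: keys are 1-D intervals,
pages are filled by a simple sort-and-pack (not GiST's real picksplit),
and the names build_pages, entries_examined and simulate are made up for
this illustration. It only shows the trend: more entries per page gives
wider page keys, and a point query then has to examine more leaf entries.

```python
import random


def build_pages(intervals, entries_per_page):
    """Pack intervals (sorted by start) into pages.

    Each page key is the union of its entries, a stand-in for a GiST page key.
    """
    ordered = sorted(intervals)
    pages = []
    for i in range(0, len(ordered), entries_per_page):
        entries = ordered[i:i + entries_per_page]
        key = (min(lo for lo, _ in entries), max(hi for _, hi in entries))
        pages.append((key, entries))
    return pages


def entries_examined(pages, q):
    """Count leaf entries on every page whose key contains the point q."""
    return sum(len(entries) for (lo, hi), entries in pages if lo <= q <= hi)


def simulate(entries_per_page, n_intervals=20000, n_queries=500, seed=42):
    """Return (average page key width, average entries examined per query)."""
    rng = random.Random(seed)  # fixed seed: same data for every page size
    intervals = []
    for _ in range(n_intervals):
        lo = rng.uniform(0.0, 1000.0)
        intervals.append((lo, lo + rng.uniform(0.0, 1.0)))
    pages = build_pages(intervals, entries_per_page)
    avg_key_width = sum(hi - lo for (lo, hi), _ in pages) / len(pages)
    queries = [rng.uniform(0.0, 1000.0) for _ in range(n_queries)]
    avg_examined = sum(entries_examined(pages, q) for q in queries) / n_queries
    return avg_key_width, avg_examined


if __name__ == "__main__":
    for per_page in (50, 100, 200, 400):
        width, examined = simulate(per_page)
        print(per_page, round(width, 2), round(examined, 1))
```

Each output row gives entries per page, average key width, and average
entries examined per point query. The absolute numbers mean nothing for a
real index; only the direction of the trend is of interest.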
regards,
Yeb Havinga
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)