James Mansion wrote:
Mark Mielke wrote:
At a minimum, this breaks your query into: 1) Preload all the index
pages you will need
Isn't this fairly predictable - the planner has chosen the index so it
will be operating
on a bounded subset.
What is the bounded subset? It is bounded by the value. What value? You
need to access the first page before you know what the second page is.
PostgreSQL or the kernel should already have the hottest pages in
memory, so the value of doing async I/O is very likely the cooler pages
that are unique to the query. We don't know what the cooler pages are
until we follow three tree down.
, 2) Scan the index pages you needed
Yes, and AIO helps when you can scan them in arbitrary order, as they
are returned.
I don't think you are talking about searching a B-Tree, as the order is
important when searching, and search performance would be reduced if one
reads and scans more pages than necessary to map from the value to the
row. I presume you are talking about scanning the entire index. Where
"needed" means "all". Again, this only benefits a subset of the queries.
, 3) Preload all the table page you will need
No - just request that they load. You can do work as soon as any page
is returned.
The difference between preload and handling async I/O in terms of
performance is debatable. Greg reports that async I/O on Linux sucks,
but posix_fadvise*() has substantial benefits. posix_fadvise*() is
preload not async I/O (he also reported that async I/O on Solaris seems
to work well). Being able to do work as the first page is available is a
micro-optimization as far as I am concerned at this point (one that may
not yet work on Linux), as the real benefit comes from utilizing all 12
disks in Matthew's case, not from guaranteeing that data is processed as
soon as possible.
, 4) Scan the table pages you needed.
In the order that is most naturally returned by the disks.
Micro-optimization.
But do you really need the whole index?
I don't think I suggested that.
What if you only need parts of the index, and this plan now reads the
whole index using async I/O "just in case" it is useful?
Where did you get that from?
I get it from your presumption that you can know which pages of the
index to load in advance. The only way you can know which pages must be
loaded, is to know that you need to query them all. Unless you have some
way of speculating with some degree of accuracy which colder pages in
the index you will need, without looking at the index?
index scan into a regular B-Tree scan, which is difficult to perform
async I/O for, as you don't necessarily know which pages to read next.
Most B-trees are not so deep. It would generally be a win to retain
interior nodes of indices in
shared memory, even if leaf pages are not present. In such a case, it
is quite quick to establish
which leaf pages must be demand loaded.
This is bogus. The less deep the B-Tree is, the less there should be any
requirement for async I/O. Hot index pages will be in cache.
I'm not suggesting that Postgres indices are structured in a way that
would support this sort
of thing now.
In your hand waving, you have assumed that PostgreSQL B-Tree index might
need to be replaced? :-)
Cheers,
mark
--
Mark Mielke <mark@xxxxxxxxx>
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings