Re: RAID arrays and performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



James Mansion wrote:
Mark Mielke wrote:
At a minimum, this breaks your query into: 1) Preload all the index pages you will need
Isn't this fairly predictable - the planner has chosen the index so it will be operating
on a bounded subset.
What is the bounded subset? It is bounded by the value. What value? You need to access the first page before you know what the second page is. PostgreSQL or the kernel should already have the hottest pages in memory, so the value of doing async I/O is very likely the cooler pages that are unique to the query. We don't know what the cooler pages are until we follow three tree down.

, 2) Scan the index pages you needed
Yes, and AIO helps when you can scan them in arbitrary order, as they are returned.

I don't think you are talking about searching a B-Tree, as the order is important when searching, and search performance would be reduced if one reads and scans more pages than necessary to map from the value to the row. I presume you are talking about scanning the entire index. Where "needed" means "all". Again, this only benefits a subset of the queries.

, 3) Preload all the table page you will need
No - just request that they load. You can do work as soon as any page is returned.

The difference between preload and handling async I/O in terms of performance is debatable. Greg reports that async I/O on Linux sucks, but posix_fadvise*() has substantial benefits. posix_fadvise*() is preload not async I/O (he also reported that async I/O on Solaris seems to work well). Being able to do work as the first page is available is a micro-optimization as far as I am concerned at this point (one that may not yet work on Linux), as the real benefit comes from utilizing all 12 disks in Matthew's case, not from guaranteeing that data is processed as soon as possible.

, 4) Scan the table pages you needed.
In the order that is most naturally returned by the disks.

Micro-optimization.

But do you really need the whole index?
I don't think I suggested that.
What if you only need parts of the index, and this plan now reads the whole index using async I/O "just in case" it is useful?
Where did you get that from?

I get it from your presumption that you can know which pages of the index to load in advance. The only way you can know which pages must be loaded, is to know that you need to query them all. Unless you have some way of speculating with some degree of accuracy which colder pages in the index you will need, without looking at the index?

index scan into a regular B-Tree scan, which is difficult to perform async I/O for, as you don't necessarily know which pages to read next.
Most B-trees are not so deep. It would generally be a win to retain interior nodes of indices in shared memory, even if leaf pages are not present. In such a case, it is quite quick to establish
which leaf pages must be demand loaded.

This is bogus. The less deep the B-Tree is, the less there should be any requirement for async I/O. Hot index pages will be in cache.

I'm not suggesting that Postgres indices are structured in a way that would support this sort
of thing now.

In your hand waving, you have assumed that PostgreSQL B-Tree index might need to be replaced? :-)

Cheers,
mark

--
Mark Mielke <mark@xxxxxxxxx>

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux