Re: RAID arrays and performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



James Mansion wrote:
Mark Mielke wrote:
This assumes that you can know which pages to fetch ahead of time - which you do not except for sequential read of a single table.
Why doesn't it help to issue IO ahead-of-time requests when you are scanning an index? You can read-ahead in index pages, and submit requests for data pages as soon as it is clear you'll want them. Doing so can allow the disks and OS to relax the order in which you receive them, which may allow you to process them while IO continues, and it may also optimise away some seeking and settle time. Maybe.
Sorry to be unclear. To achieve a massive speedup (12X for 12 disks with RAID 0) requires that you know what reads to perform in advance. The moment you do not, you only have a starting point, your operations begin to serialize again. For example, you must scan the first index, to be able to know what table rows to read. At a minimum, this breaks your query into: 1) Preload all the index pages you will need, 2) Scan the index pages you needed, 3) Preload all the table page you will need, 4) Scan the table pages you needed. But do you really need the whole index? What if you only need parts of the index, and this plan now reads the whole index using async I/O "just in case" it is useful? Index is a B-Tree for a reason. In Matthew's case where he has an IN clause with thousands of possibles (I think?), perhaps a complete index scan is always the best case - but that's only one use case, and in my opinion, an obscure one. As soon as additional table joins become involved, the chance that whole index scans are required would probably normally reduce, which turns the index scan into a regular B-Tree scan, which is difficult to perform async I/O for, as you don't necessarily know which pages to read next.

It seems like a valuable goal - but throwing imaginary numbers around does not appeal to me. I am more interested in Gregory's simulations. I would like to understand his simulation better, and see his results. Speculation about amazing potential is barely worth the words used to express it. The real work is in design and implementation. :-)

Cheers,
mark

--
Mark Mielke <mark@xxxxxxxxx>

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

              http://www.postgresql.org/docs/faq

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux