On 2007-03-30, Peter Schuller <peter.schuller@xxxxxxxxxxxx> wrote: [...] > Other than absolute performance, an important goal is to be able to > scale fairly linearly with the number of underlying disk drives. We > are fully willing to take a disk seek per item selected, as long as it > scales. > > To this end I have been doing some benchmarking to see whether the > plan is going to be feasable. On a 12 disk hardware stripe, insertion > performance does scale somewhat with concurrent inserters. However, I > am seeing surprising effects with SELECT:s: a single selecter > generates the same amount of disk activity as two concurrent selecters > (I was easily expecting about twice). > > The query is simple: > > SELECT * FROM test WHERE value = 'xxx' LIMIT 1000; I tested this on a 14-way software raid10 on freebsd, using pg 8.1.6, and couldn't reproduce anything like it. With one client I get about 200 disk requests per second, scaling almost exactly linearly for the first 5 or so clients, as expected. At 14 clients it was down to about 150 reqs/sec per client, but the total throughput continued to increase with additional concurrency up to about 60 clients, giving about 3600 reqs/sec (260 per disk, which is about right for 10krpm scsi disks under highly concurrent random loads). > So my first question is - why am I not seeing this scaling? A good question. Have you tried testing the disks directly? e.g. create some huge files, and run a few concurrent random readers on them? That would test the array and the filesystem without involving postgres. > Secondly, I am seeing a query plan switch after a certain > threshold. Observe: [snip index scan changing to bitmapscan] This is entirely expected. With the larger row count, it is more likely (or so the planner estimates) that rows will need to be fetched from adjacent or at least nearby blocks, thus a plan which fetches rows in physical table order rather than index order would be expected to be superior. The planner takes into account the estimated startup cost and per-row cost when planning LIMIT queries; therefore it is no surprise that for larger limits, it switches to a plan with a higher startup cost but lower per-row cost. > The interesting part is that the latter query is entirely CPU bound > (no disk I/O at all) for an extended period of time before even > beginning to read data from disk. Most likely your index is small enough that large parts of it will be cached in RAM, so that the scan of the index to build the bitmap does not need to hit the disk much if at all. > And when it *does* start performing > disk I/O, the performance is about the same as for the other case. In > other words, the change in query plan seems to do nothing but add > overhead. This is going to depend quite a bit on the physical ordering of the data relative to the indexed values. > What is the bitmap heap scan supposed to be doing that would increase > performance above a "seek once per matching row" plan? I haven't been > able to Google my way to what the intended benefit is of a heap scan > vs. a plain index scan. The bitmap scan visits the heap in heap order, rather than index order, thus enabling it to take advantage of prefetch and other sequential-read optimizations (in the underlying OS and disk subsystem, rather than in pg itself). -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services