Markus,

First, thanks - your email was very enlightening. But it does bring up a
few additional questions, so thanks for your patience as well. I've
listed them below.

> It applies per active backend. When connecting, the Postmaster forks a
> new backend process. Each backend process has its own scanner and
> executor. The main postmaster is only for coordination (forking, config
> reload etc.); all the work is done in the forked per-connection backends.

Each postgres backend also uses shared memory (including the shared
buffer cache), so that it doesn't re-fetch data that another backend has
already read, correct?

> Our discussion is about a different type of application, where you have
> a single application issuing a single query at a time, dealing with a
> large amount (several gigs up to terabytes) of data.

These are commonly referred to as OLAP applications, correct? That is
where I believe my application is more focused (it may handle some
transactions in the future, but at the moment it follows the "load lots
of data, then analyze it" pattern).

> The discussed problem arises when such large queries generate random
> (non-contiguous) disk access (e.g. index scans). Here, the underlying
> RAID cannot effectively prefetch data, as it does not know what the
> application will need next. This effectively limits the speed to that
> of a single disk, regardless of the details of the underlying RAID,
> since it can only process one request at a time and has to wait for
> the application before issuing the next one.

Does this have anything to do with postgres indexes not storing data, as
some previous posts to this list have mentioned? (In other words, having
the index in memory doesn't help? Or are we talking about indexes that
are too large to fit in RAM?)

So this limitation applies per query? Could it be alleviated somewhat if
I ran multiple smaller queries? For example, to build a summary table
over 500M records, I could fire off 5 queries that each count 100M
records and update the summary table, leaving MVCC to handle update
contention (see the sketch in the P.S. below).

Actually, now that I think about it, that would only help if the sections
I mentioned above were on different disks, right? So I would actually
have to do table partitioning, with tablespaces on different spindles,
for it to be beneficial (see P.P.S.)? (Which is basically not feasible
with RAID, since I don't get to pick which disks the data goes on...)

Are there any other workarounds for current postgres?

Thanks again,
Bucky
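
P.S. To make the "multiple smaller queries" idea concrete, here is a
rough sketch of what I have in mind. The names (fact_table, id, bucket,
summary) are made up for illustration, and each INSERT would run from
its own connection so the slices proceed in parallel:

    -- Slice 1 of 5: aggregate the first 100M ids into the summary table.
    INSERT INTO summary (bucket, cnt)
    SELECT bucket, count(*)
    FROM fact_table
    WHERE id >= 0 AND id < 100000000
    GROUP BY bucket;

    -- Slices 2-5 are identical except for the id range, e.g.
    -- WHERE id >= 100000000 AND id < 200000000, and so on.
    -- A final query then rolls the per-slice counts up:
    SELECT bucket, sum(cnt) FROM summary GROUP BY bucket;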
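
P.P.S. And the partitioning-on-separate-spindles idea, as I understand
it, would look roughly like the following, using the inheritance-based
partitioning current postgres supports. Paths, table names, and the id
ranges are hypothetical:

    -- One tablespace per physical spindle (or independent array).
    CREATE TABLESPACE spindle1 LOCATION '/mnt/disk1/pgdata';
    CREATE TABLESPACE spindle2 LOCATION '/mnt/disk2/pgdata';

    -- Child tables inherit from fact_table; the CHECK constraints let
    -- the planner skip partitions when constraint_exclusion is enabled.
    CREATE TABLE fact_part1 (
        CHECK (id >= 0 AND id < 100000000)
    ) INHERITS (fact_table) TABLESPACE spindle1;

    CREATE TABLE fact_part2 (
        CHECK (id >= 100000000 AND id < 200000000)
    ) INHERITS (fact_table) TABLESPACE spindle2;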