Markus,

First, thanks - your email was very enlightening. But it does bring up a
few additional questions, so thanks for your patience as well. I've
listed them below.

> It applies per active backend. When connecting, the Postmaster forks a
> new backend process. Each backend process has its own scanner and
> executor. The main postmaster is only for coordination (forking, config
> reload etc.); all the work is done in the forked per-connection backends.

Each postgres backend also uses shared memory (including the shared
buffer cache), so that it doesn't re-fetch data that another backend has
already read, correct?

> Our discussion is about a different type of application, where you have
> a single application issuing a single query at a time, dealing with a
> large amount (several gigs up to terabytes) of data.

These are commonly referred to as OLAP applications, correct? That is
where I believe my application is more focused (it may handle some
transactions in the future, but at the moment it follows the "load lots
of data, then analyze it" pattern).

> The discussed problem arises when such large queries generate random
> (non-contiguous) disk access (e.g. index scans). Here, the underlying
> RAID cannot effectively prefetch data, as it does not know what the
> application will need next. This effectively limits the speed to that
> of a single disk, regardless of the details of the underlying RAID,
> since it can only process one request at a time and has to wait for
> the application before issuing the next one.

Does this have anything to do with postgres indexes not storing data, as
some previous posts to this list have mentioned? (In other words, having
the index in memory doesn't help? Or are we talking about indexes that
are too large to fit in RAM?)

So this limitation applies per query? Could it be alleviated somewhat if
I ran multiple smaller queries? For example, to build a summary table
over 500M records, I could fire off 5 queries that each count 100M
records and update the summary table, leaving MVCC to handle update
contention (see the sketch in the P.S. below).

Actually, now that I think about it, that would only help if the sections
I mentioned above were on different disks, right? So I would actually
have to do table partitioning, with tablespaces on different spindles,
for it to be beneficial (see P.P.S.)? (Which is basically not feasible
with RAID, since I don't get to pick which disks the data goes on...)

Are there any other workarounds for current postgres?

Thanks again,
Bucky
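
P.S. To make the "multiple smaller queries" idea concrete, here is a
rough sketch of what I have in mind. The names (fact_table, id, bucket,
summary) are made up for illustration, and each INSERT would run from
its own connection so the slices proceed in parallel:

    -- Slice 1 of 5: aggregate the first 100M ids into the summary table.
    INSERT INTO summary (bucket, cnt)
    SELECT bucket, count(*)
    FROM fact_table
    WHERE id >= 0 AND id < 100000000
    GROUP BY bucket;

    -- Slices 2-5 are identical except for the id range, e.g.
    -- WHERE id >= 100000000 AND id < 200000000, and so on.
    -- A final query then rolls the per-slice counts up:
    SELECT bucket, sum(cnt) FROM summary GROUP BY bucket;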
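
P.P.S. And the partitioning-on-separate-spindles idea, as I understand
it, would look roughly like the following, using the inheritance-based
partitioning current postgres supports. Paths, table names, and the id
ranges are hypothetical:

    -- One tablespace per physical spindle (or independent array).
    CREATE TABLESPACE spindle1 LOCATION '/mnt/disk1/pgdata';
    CREATE TABLESPACE spindle2 LOCATION '/mnt/disk2/pgdata';

    -- Child tables inherit from fact_table; the CHECK constraints let
    -- the planner skip partitions when constraint_exclusion is enabled.
    CREATE TABLE fact_part1 (
        CHECK (id >= 0 AND id < 100000000)
    ) INHERITS (fact_table) TABLESPACE spindle1;

    CREATE TABLE fact_part2 (
        CHECK (id >= 100000000 AND id < 200000000)
    ) INHERITS (fact_table) TABLESPACE spindle2;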