Hi, Bucky,

Bucky Jordan wrote:

> Each postgres process also uses shared memory (aka the buffer cache) so
> as to not fetch data that another process has already requested,
> correct?

Yes. Additionally, the OS caches disk blocks. Most unixoid systems like
Linux use (nearly) all unused memory for this block cache; I don't know
about Windows.

> Commonly these are referred to as OLAP applications, correct? Which is
> where I believe my application is more focused (it may be handling some
> transactions in the future, but at the moment, it follows the "load lots
> of data, then analyze it" pattern).

Yes, most OLAP apps fall into this category. But I also think that most
OLAP apps mainly generate sequential data access (sequential scans), for
which the OS prefetching of data works fine. Btw, some weeks ago, a patch
was mentioned here that improves the Linux kernel I/O scheduler w.r.t.
those prefetching capabilities.

> Does this have anything to do with postgres indexes not storing data, as
> some previous posts to this list have mentioned? (In other words, having
> the index in memory doesn't help? Or are we talking about indexes that
> are too large to fit in RAM?)

Yes, it does, but only in the cases where your query fetches only columns
that are in the index. In cases where you fetch other columns, PostgreSQL
has to access the heap anyway to fetch those.

The overhead of checking outdated row versions (those that were updated or
deleted, but not yet vacuumed) is zero here, as those "load bulk, then
analyze" applications typically don't create invalid rows, so every row
fetched from the heap is valid. This is very different in OLTP
applications.

> So this issue would be only on a per query basis? Could it be alleviated
> somewhat if I ran multiple smaller queries? For example, I want to
> calculate a summary table on 500m records - fire off 5 queries that
> count 100m records each and update the summary table, leaving MVCC to
> handle update contention?

Yes, you could do that, but only if you're CPU bound and have a multi-core
machine. And you need table partitioning, as LIMIT/OFFSET is expensive.
Btw, the Bizgres people do exactly this under their hood, so it may be
worth a look.

If you're I/O bound, and your query is a full table scan or something else
that results in (nearly) sequential disk access, the OS prefetch algorithm
will work. You can use some I/O monitoring tools to compare the actual
speed at which the data comes in when PostgreSQL does the sequential scan,
and compare it to dd'ing the database table files. For simple aggregates
like sum(), you usually get near the "raw" speed, and the real bottlenecks
are the disk I/O rate, bad RAID implementations, or PCI bus contention.

> Actually, now that I think about it - that would only work if the
> sections I mentioned above were on different disks right? So I would
> actually have to do table partitioning with tablespaces on different
> spindles to get that to be beneficial? (which is basically not feasible
> with RAID, since I don't get to pick what disks the data goes on...)

If you really need that much throughput, you can always put the different
partitions on different RAIDs. But hardware gets very expensive in those
dimensions, and it may be better to partition the data across different
machines altogether. AFAIK, Bizgres MPP does exactly that.
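To illustrate the partition-per-query idea from above, here's a minimal
sketch using inheritance-based partitioning. All table and column names
("facts", "bucket", "summary") are made up for the example:

    -- parent table plus two child partitions, split on a hypothetical
    -- "bucket" column; the CHECK constraints let the planner skip
    -- partitions when constraint_exclusion is enabled
    CREATE TABLE facts (bucket integer, value numeric);
    CREATE TABLE facts_p0 (CHECK (bucket BETWEEN 0 AND 99))
        INHERITS (facts);
    CREATE TABLE facts_p1 (CHECK (bucket BETWEEN 100 AND 199))
        INHERITS (facts);

    CREATE TABLE summary (bucket_range text, total numeric);

    -- each session then aggregates "its" partition directly, instead of
    -- paging through the parent table with LIMIT/OFFSET:
    INSERT INTO summary (bucket_range, total)
        SELECT '0-99', sum(value) FROM facts_p0;
    INSERT INTO summary (bucket_range, total)
        SELECT '100-199', sum(value) FROM facts_p1;

Run from separate connections in parallel, each of those statements scans
its own partition sequentially, which is where the multi-core (or
multi-spindle) gain would come from.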
> Are there any other workarounds for current postgres?

Are your questions of a theoretical nature, or do you have a concrete
problem? In the latter case, you could post your details here, and we'll
see whether we can help. Btw, I'm not related to Bizgres in any way. :-)

HTH,
Markus

-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org