On Tue, Apr 20, 2010 at 1:39 PM, David Kerr <dmk@xxxxxxxxxxxxxx> wrote:
> Howdy all,
>
> I've got a huge server running just postgres. It's got 48 cores and 256GB of RAM, Redhat 5.4, Postgres 8.3.9,
> 64-bit OS. No users currently.
>
> I've got a J2EE app that loads data into the DB. It's got logic behind it, so it's not a simple bulk load,
> so I don't think we can use COPY.
>
> Based on the tuning guides, I set effective_cache_size to 128GB (1/2 the available memory) on the box.
>
> When I ran my load, it took approximately 15 hours to load 20 million records. I thought this was odd
> because on a much smaller machine I was able to load the same number of records in 6 hours.
>
> My initial thought was hardware issues, so we got sar, vmstat, etc. all running on the box and they didn't
> give any indication that we had resource issues.
>
> So I decided to just make the 2 PG config files look the same. (The only change was dropping
> effective_cache_size from 128GB to 2GB.)
>
> Now the large box performs the same as the smaller box (which is fine).
>
> Incidentally, both tests were starting from a blank database.
>
> Is this expected?

Lowering effective_cache_size tends to discourage the planner from using a
nested-loop-with-inner-indexscan plan - that's it.

What may be happening is that you are loading data into some tables and then
running a query against those tables before the autovacuum daemon has had a
chance to analyze them. I suspect that if you enable some logging you'll find
that one of those queries is really, really slow, and that (by happy
coincidence) discouraging it from using the index it thinks it should use
happens to produce a better plan. What you should probably do is, for each
table that you bulk load and then query, insert a manual ANALYZE between the
two.

...Robert
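[Editor's note: a minimal sketch of the load-then-ANALYZE-then-query pattern
suggested above. The staging_orders table and its columns are hypothetical,
not taken from the thread.]

    -- Bulk-load step (hypothetical table), e.g. driven by the J2EE app:
    INSERT INTO staging_orders (order_id, customer_id, amount)
    VALUES (1, 42, 19.99);

    -- Refresh planner statistics before querying the freshly loaded table,
    -- so the planner costs index scans against realistic row estimates
    -- instead of the empty-table defaults:
    ANALYZE staging_orders;

    -- Subsequent queries now plan against up-to-date statistics:
    SELECT customer_id, sum(amount)
    FROM staging_orders
    GROUP BY customer_id;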