On Thu, Sep 28, 2006 at 01:47:44PM -0400, Carlo Stonebanks wrote: > > How are you loading the tables? Copy? Insert? > > Once the data is transformed, it is inserted. I don't have stats, but the > programs visual feedback does not spend a lot of time on the "inserting > data" message. Then again, if there is an asynchronous component to an > insert, perhaps I am not seeing how slow an insert really is until I query > the table. Well, individual inserts are slow, especially if they're not wrapped up in a transaction. And you also mentioned checking for dupes. I suspect that you're not going to find any huge gains in tuning the database... it sounds like the application (as in: how it's using the database) is what needs help. > >> work_mem = 32768 > > > > Depending on what you are doing, this is could be to low or to high. > > Is this like "You could be too fat or too thin"? Aren't you impressed with > the fact that I managed to pick the one number that was not right for > anything? For what you're doing, it's probably fine where it is... but while you're in the single-thread case, you can safely make that pretty big (like 1000000). > > > >> maintenance_work_mem = 32768 > >> checkpoint_segments = 128 > >> effective_cache_size = 10000 > > > > This coudl probably be higher. I'd suggest setting it to about 3G, or 375000. > > > >> random_page_cost = 3 > >> stats_start_collector = on > >> stats_command_string = on > >> stats_row_level = on > >> autovacuum = on > > > > Stats are a hit... you need to determine if you actually need them. > > Unfortunately, this is the only way I know of of getting the query string to > appear in the PostgreSQL server status display. While trying to figure out > what is slowing things down, having that is really helpful. I also imagined > that this sort of thing would be a performance hit when you are getting lots > of small, concurrent queries. In my case, we have queries which are taking > around a second to perform outer joins. They aren't competing with any other > requests as the site is not running, we are just running one app to seed the > data. stats_command_string can extract a huge penalty pre-8.2, on the order of 30%. I'd turn it off unless you *really* need it. Command logging (ie: log_min_duration_statement) is much less of a burden. The fact that you're doing outer joins while loading data really makes me suspect that the application needs to be changed for any real benefits to be had. But you should still look at what EXPLAIN ANALYZE is showing you on those queries; you might be able to find some gains there. -- Jim Nasby jim@xxxxxxxxx EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)