Greg, > 1) Collect up data about their system (memory, disk layout), find out a > bit about their apps/workload, and generate a config file based on that. We could start with this. Where I bogged down is that collecting system information about several different operating systems ... and in some cases generating scripts for boosting things like shmmax ... is actually quite a large problem from a slog perspective; there is no standard way even within Linux to describe CPUs, for example. Collecting available disk space information is even worse. So I'd like some help on this portion. I actually have algorithms which are "good enough to start with" for most of the important GUCs worked out, and others could be set through an interactive script ("Does your application require large batch loads involving thousands or millions of updates in the same transaction?" "How large (GB) do you expect your database to be?") > 2) Connect to the database and look around. Study the tables and some > their stats, make some estimates based on what your find, produce a new > config file. I'm not sure that much more for (2) can be done than for (1). Tables-on-disk don't tell us much. > 3) Monitor the database while it's doing its thing. See which parts go > well and which go badly by viewing database statistics like pg_statio. > From that, figure out where the bottlenecks are likely to be and push more > resources toward them. What I've been working on lately is exposing more > readouts of performance-related database internals to make this more > practical. We really should collaborate on that. > When first exposed to this problem, most people assume that (1) is good > enough--ask some questions, look at the machine, and magically a > reasonable starting configuration can be produced. It's already been > pointed out that anyone with enough knowledge to do all that can probably > spit out a reasonable guess for the config file without help. But that's actually more than most people already do. Further, if you don't start with a "reasonable" configuration, then it's difficult-impossible to analyze where your settings are out-of-whack; behavior introduced by some way-to-low settings will mask any other tuning that needs to be done. It's also hard/impossible to devise tuning algorithms that work for both gross tuning (increase shared_buffers by 100x) and fine tuning (decrease bgwriter_interval to 45ms). So whether or not we do (3), we need to do (1) first. -- Josh Berkus PostgreSQL @ Sun San Francisco