>-----Original Message-----
>From: Alvaro Herrera [mailto:alvherre@xxxxxxxxxxxxxxxxx]
>Sent: Tuesday 9 January 2007 3:43
>To: Joris Dobbelsteen
>Cc: Chris Browne; pgsql-general@xxxxxxxxxxxxxx
>Subject: Re: [GENERAL] Autovacuum Improvements
>
>Joris Dobbelsteen wrote:
>> Why not collect some information from live databases and perform some
>> analysis on it?
>
>We already capture and utilize operational data from databases:
[snip]

What I was pointing out is that you have collected the metrics required
to establish a maintenance policy (for vacuum in this case). This means
others can collect them too.

>But that data alone is not sufficient. You (the DBA) have to provide
>the system with timing information (e.g., at what time is it
>appropriate to vacuum huge tables).
[snip]

That would be the expected result of what you want to happen. So there
are a few more metrics that the DBA has to set.
What is wrong with building a model and performing some analysis on
that model? If it suits your situation, that would be a starting point.

>Capturing data about someone's database is not going to help someone
>else's vacuuming strategy, because their usage patterns are
>potentially so different; and there are as many usage patterns as
>Imelda Marcos had shoes (well, maybe not that many), so any strategy
>that considers only two particular pairs of shoes is not going to fly.
>We need to provide enough configurability to allow DBAs to make the
>vacuumer fit their situation.

Now we have at least one different model, so let's mix in other captures
and situations. It cannot be done with only YOUR data, I fully agree.
But if you have sufficient data, you can generalize the model so that it
works reasonably well in a sufficient number of situations.

Of course models need time to evolve, but so does the implementation,
which currently evolves at a slow rate. From do-it-yourself, to scripts,
to the current autovacuum integration (which is good).
From doing all tables sequentially, to having some intelligence through
update thresholds, to whatever comes next.

I think it is better to solve the problem this way, as models are
relatively easy to compare, unlike arguments made without data that can
be analyzed or simulated.

- Joris