On Thursday 15 May 2008 03:02:19 Tom Lane wrote: > "Jan de Visser" <jdevisser@xxxxxxxxxxxxxxxxxx> writes: > > Obviously, this issue is tied to the slow count(*) one, as I found out > > the hard way. Consider the following scenario: > > * Insert row > > * Update that row a couple of times > > * Rinse and repeat many times > > > > Now somewhere during that cycle, do a select count(*) just to see > > where you are. You will be appalled by how slow that is, due to not > > only the usual 'slow count(*)' reasons. This whole hint bit business > > makes it even worse, as demonstrated by the fact that running a vacuum > > before the count(*) makes the latter noticably faster. > > Uh, well, you can't blame that entirely on hint-bit updates. The vacuum > has simply *removed* two-thirds of the rows in the system, resulting in > a large drop in the number of rows that the select even has to look at. > > It's certainly true that hint-bit updates cost something, but > quantifying how much isn't easy. The off-the-cuff answer is to do the > select count(*) twice and see how much cheaper the second one is. But > there are two big holes in that answer: the first is the possible cache > effects from having already read in the pages, and the second is that > the follow-up scan gets to avoid the visits to pg_clog that the first > scan had to make (which after all is the point of the hint bits). > > I don't know any easy way to disambiguate the three effects that are at > work here. But blaming it all on the costs of writing out hint-bit > updates is wrong. > > regards, tom lane True. But it still contributes to the fact that queries sometimes behave in a non-deterministic way, which IMHO is the major annoyance when starting to work with pgsql. And contrary to other causes (vacuum, checkpoints) this is woefully underdocumented. jan