The partial index is highly leveraged. If every tuple in the table is updated once, that amounts to every tuple in the index
> being updated 25,000 times.
How so? That sounds like O(n_2) behaviour. If the table has 5 million rows while the index has 200 (active) rows at any given time, then to update every row in the table to null and back again would be 100% turn over of the table. But each such change would lead to an addition and then a deletion from the index. So 100% turnover of the table would be a 5 million / 200 = 25,000 fold turn of the index.
Sorry, I was being dense. I misread that as: "every time a single tuple in the table is updated, the entire index (every row) is updated". Yes, of course your explanation makes sense.
There is some code that allows a btree index entry to get killed (and so the slot to be reused) without any vacuum, if a scan follows that entry and finds the corresponding tuple in the table no longer visible to anyone. I have not examined this code, and don't know whether it is doing its job but just isn't enough to prevent the bloat, or if for some reason it is not applicable to your situation.
It looks like my solution is going to be a REINDEX invoked from cron, or maybe just every 100k inserts.
In terms of trying to improve this behaviour for other PG users in the future, are there any more diagnostics I can do for you? Having found a special case, I'd like to help permanently resolve it if I can.
Thanks very much again. Best wishes, Richard -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance