On 09/02/10 11:25, Ben Campbell wrote:
[snip]
I need to set all
those flags, but it takes _ages_ to do "UPDATE article SET
needs_indexing=true;"
[snip]
I _think_ the reason it takes so long is that postgresql doesn't modify
rows in place - it creates an entry for the modified row and zaps the
old one. So by touching _every_ row I'm basically forcing it to rebuild
my whole database... I've got about 2 million rows in 'articles'.
There are a few indexes on columns in 'articles' which obviously will
slow things down too.
Exactly so.
I've had a minor attempt at tuning (increased checkpoint_segments) an d
I'm sure there are a bunch of other tricks I could use to bulk-set that
flag in much less time...
Probably not, actually.
But my gut feeling is that the flag would be better off in it's own
table anyway, eg:
CREATE TABLE needs_indexing (
article_id integer references article(id)
);
So, if an article is listed in this table, it needs indexing.
(maybe with a constraint to ensure uniqueness - I only need articles
entered once in this table)
That sounds sensible to me - "needs indexing" depends on two things:
1. Is the article more recent than it's FTS index
2. Have you asked for it to be re-indexed anyway.
The first point concerns two tables, and the second is (from the
database point of view) arbitrary. A separate table containing indexing
jobs to do seems the sensible approach. Lets you batch up your
re-indexing work too.
Oh - you might want to consider how/whether to handle multiple entries
for the same article in your queue.
--
Richard Huxton
Archonet Ltd
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general