Search Postgresql Archives

Re: problems maintaining boolean columns in a large table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ben,

could you post your trigger function? When you need to rebuild the index,
you could disable the trigger setting the flag if the article is modified.
This could speed up your UPDATE.



Mit freundlichen Grüßen
Timo Klecker 




-----Ursprüngliche Nachricht-----
Von: pgsql-general-owner@xxxxxxxxxxxxxx
[mailto:pgsql-general-owner@xxxxxxxxxxxxxx] Im Auftrag von Ben Campbell
Gesendet: Dienstag, 9. Februar 2010 12:26
An: pgsql-general@xxxxxxxxxxxxxx
Betreff:  problems maintaining boolean columns in a large table

I've got a database that holds a bunch of articles in a table called 
'article'. It has a bunch of columns, and each row might hold a few KB 
of data, say.
I'm maintaining a separate fulltext database, and so I added a boolean 
flag, 'needs_indexing' to my 'article' table to keep track of which 
articles have been indexed (and I have some trigger functions on 
'article' to automatically set the flag if the article is modified).

It all works fine.
Except when I want to rebuild my index from scratch. I need to set all 
those flags, but it takes _ages_  to do "UPDATE article SET 
needs_indexing=true;" (many hours at least - I've never let it run to 
completion)

I _think_ the reason it takes so long is that postgresql doesn't modify 
rows in place - it creates an entry for the modified row and zaps the 
old one. So by touching _every_ row I'm basically forcing it to rebuild 
my whole database... I've got about 2 million rows in 'articles'.
There are a few indexes on columns in 'articles' which obviously will 
slow things down too.

I've had a minor attempt at tuning (increased checkpoint_segments) an d 
I'm sure there are a bunch of other tricks I could use to bulk-set that 
flag in much less time...

But my gut feeling is that the flag would be better off in it's own 
table anyway, eg:

CREATE TABLE needs_indexing (
   article_id integer references article(id)
);

So, if an article is listed in this table, it needs indexing.
(maybe with a constraint to ensure uniqueness - I only need articles 
entered once in this table)

Does this sound like a reasonable way to go?
Any advice or insight welcome!

Thanks,
Ben.

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux