Carsten Kropf <ckropf2@xxxxxxxxx> writes:
> I am currently implementing some index access methods on top of
> PostgreSQL. Until now, it is pretty fine and working properly.
> However, I am now doing the implementation of bulk deletion and
> vacuum of the structure. I don't know exactly how to achieve this,
> because it would be much easier to just collect statistics in
> bulkdelete and to implement the "real deal" of deleting the
> particular entries from my structures when vacuum is called on the
> index. Is it legitimate to do this: just collect statistics and pass
> the statistics and items to be deleted in main memory back to the
> caller, and perform the real deletion of entries in vacuum?

No.  You *must* make the index entries go away during bulkdelete,
because the heap tuples they are pointing at will be deleted as soon
as it returns.  If you don't do this, and there's a crash before the
vacuum finishes, you have dangling index entries pointing at
nonexistent heap tuples, which will lead to big trouble later.  I
think you probably don't even need a crash to have trouble ---
consider a concurrent indexscan query that finds one of those index
entries and tries to visit the heap tuple from it.

The other problem with your sketch is that you can't assume you have
an indefinitely large amount of working memory available.

Perhaps you could set a flag on each deleted index tuple during
bulkdelete (with scans knowing to ignore marked tuples) and then do
the physical reorganization at vacuum cleanup.  This would imply doing
a full scan of the index during cleanup (to find the dead entries),
but we do similar things in btree indexes and the performance seems to
be OK.

BTW, this seems a bit off-topic for pgsql-general.  You'd be better
off asking such questions in -hackers.

			regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general