On Thu, 2007-03-01 at 19:59 -0800, Adam L Beberg wrote:
> On the surface, looks like a job for GIN, but GIN seems undocumented,
> specifically mentions it doesn't support the deletes we'll have many of
> since it's designed for word searching apparently, the performance

It can delete an entry for one of the keys of an index; it just can't delete the key itself when the number of entries goes down to zero. Because you only have O(100K) possible keys, that shouldn't be a problem.

GIN indexes can reclaim space. If they couldn't, they wouldn't be nearly as useful. You only run into problems when you have a huge, sparsely populated keyspace, with a huge number of keys contained by no tuples in the table.

However, for your application, GIN still might not be the right answer. GIN can only return the tuples that contain some matching keys; it won't return the number of matching keys in each tuple (that's not the job of an index).

Let's run some numbers:

* percentage of tuples returned = 100K rows out of 10M = 1%
* tuples per page = 8192 bytes / (32 (tuple header) + 8 (bigint) + 80 (10 bigints)) = ~70; call it 50 to allow for some free space.

Based on those numbers, the GIN index is basically going to say "fetch every other page". PostgreSQL will optimize that into a sequential scan, because it makes no sense to do a random fetch for every other page.

So the fastest way you can do this (that I can see) is just to fetch every tuple and count the number of matches in each array. You know your data better than I do, so replace those numbers with real ones and see if it still makes sense. The basic rule is that an index scan is useful only if it reduces the number of disk pages you need to fetch enough to make up for the extra cost of random I/O.

Regards,
	Jeff Davis
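
For illustration, a minimal sketch of that scan-and-count approach, assuming a hypothetical table item(id bigint, keys bigint[]) and an example list of search keys; table, column, and key values are placeholders, not anything from the original mail:

  -- Hypothetical schema: item(id bigint, keys bigint[]).
  -- The ARRAY literal stands in for the ~10 keys being searched for.
  SELECT i.id,
         (SELECT count(*)                          -- matches in this row's array
            FROM unnest(i.keys) AS k
           WHERE k = ANY (ARRAY[101, 202, 303]::bigint[])) AS matches
    FROM item AS i
   WHERE i.keys && ARRAY[101, 202, 303]::bigint[]  -- keep rows with at least one match
   ORDER BY matches DESC;

With selectivity like the numbers above, the planner will normally run the && filter as a sequential scan anyway, which is consistent with the reasoning in the mail.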