On Thu, Aug 16, 2012 at 1:54 PM, Wells Oliver <wellsoliver@xxxxxxxxx> wrote: > Hey folks, a question. We have a table that's getting large (6 million rows > right now, but hey, no end in sight). Does it grow in chunks, or one row at a time? > It's wide-ish, too, 98 columns. How many of the columns are NULL for any given row? Or perhaps better, what is the distribution of values for any given column? For a given column, is there some magic value (NULL, 0, 1, -1, 9999, '') which most of the rows have? > The problem is that each of these columns needs to be searchable quickly at > an application level, and I'm far too responsible an individual to put 98 > indexes on a table. That is somewhat melodramatic. Sure, creating 98 indexes does not come for free. And it is great that you are aware of this. But just because they are not free does not mean they are not worth their cost. Look at all the other costs of using a RDBMS. Each letter of ACID does not come for free. But it is often worth the price. In the generic case, you have a large amount of data to index. Indexing a lot of data requires a lot of resources. There is magic bullet to this. > Wondering what you folks have come across in terms of > creative solutions that might be native to postgres. I can build something > that indexes the data and caches it and runs separately from PG, but I > wanted to exhaust all native options first. If the data is frequently updated/inserted, then how would you invalidate the cache when needed? And if the data is not frequently updated/inserted, then what about the obvious PG solution (building 96 indexes) is a problem? If your queries are of the nature of: where col1=:1 or col2=:1 or col3=:1 or ... col96=:1 or then a full text index would probably be a better option. Otherwise, it is hard to say. You could replace 96 columns with a single hstore column which has 96 different keys. But from what I can tell, maintaining a gin index on that hstore column would probably be slower than maintaining 96 individual btree indexes. And if you go with a gist index on the single hstore column, the cost of maintenance is greatly reduced relative to gin. But the index is basically useless, you might as well just drop the index and do the full table scan instead. Cheers, Jeff -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general