On Fri, 2009-10-23 at 07:18 +0200, Jesper Krogh wrote:
> This is indeed information on individual terms from the statistics that
> enable this.

My mistake, I didn't know it was that smart about it.

> > In effect, what you want are words that aren't searched (or stored) in
> > the index, but are included in the tsvector (so the RECHECK still
> > works). That sounds like it would solve your problem and it would reduce
> > index size, improve update performance, etc. I don't know how difficult
> > it would be to implement, but it sounds reasonable to me.

> That sounds like it could require an index rebuild if the distribution
> changes?

My thought was that the common words could be declared to be common the
same way stop words are. As long as words are only added to this list,
it should be OK.

> That would be another plan to pursue, but the MCV is already there

The problem with MCVs is that the index search can never eliminate a
document for lacking a match, because the document might contain a match
on a word that was previously an MCV but no longer is.

Also, MCVs are relatively few; you only get ~1000 or so. There might be
a lot of common words you'd like to track.

Perhaps ANALYZE can automatically add the common words above some
frequency threshold to the list?

Regards,
	Jeff Davis
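
To make the idea above concrete, here is a rough toy model in Python. It is
only a sketch of the behaviour being discussed, not how GIN or tsvector work
internally: the class CommonWordIndex and all of its method names are made up
for illustration. Common words are left out of the index but stay in the
stored word set, so the recheck still filters on them, and the common-word
list only ever grows.

```python
from collections import defaultdict


class CommonWordIndex:
    """Toy inverted index that skips words declared common (hypothetical)."""

    def __init__(self, common_words=()):
        self.common = set(common_words)
        self.postings = defaultdict(set)  # word -> ids of documents containing it
        self.docs = {}                    # doc id -> full word set (stands in for the tsvector)

    def add_document(self, doc_id, text):
        words = set(text.lower().split())
        self.docs[doc_id] = words         # common words are kept here for the recheck
        for word in words - self.common:  # ...but never stored in the index
            self.postings[word].add(doc_id)

    def declare_common(self, word):
        # Append-only: words are added to the list and never removed, so
        # existing index entries cannot become wrong and no rebuild is needed.
        self.common.add(word)
        self.postings.pop(word, None)     # optional cleanup that shrinks the index

    def search(self, query):
        terms = set(query.lower().split())
        indexed = terms - self.common
        if indexed:
            # Only non-common words can eliminate documents.
            candidates = set.intersection(
                *(self.postings.get(term, set()) for term in indexed)
            )
        else:
            # Query made of common words only: every document is a candidate.
            candidates = set(self.docs)
        # Recheck against the full word set, which still includes common words.
        return sorted(d for d in candidates if terms <= self.docs[d])


idx = CommonWordIndex(common_words={"the"})
idx.add_document(1, "the quick brown fox")
idx.add_document(2, "the lazy dog")
idx.add_document(3, "quick dog")
print(idx.search("the quick"))   # index narrows to docs 1 and 3, recheck keeps 1
idx.declare_common("quick")
print(idx.search("the quick"))   # no indexed terms left, recheck alone decides
```

Removing a word from the list is the case this sketch cannot handle: documents
added while the word was common have no index entry for it, so an index lookup
on that word would silently miss them. That is the same reason a changing MCV
list can't be used to eliminate documents.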