Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > "Kevin Grittner" <Kevin.Grittner@xxxxxxxxxxxx> writes: >> Tom Lane <tgl@xxxxxxxxxxxxx> wrote: >>> Any sane text search application is going to try to filter out >>> common words as stopwords; it's only the failure to do that that's >>> making this run slow. > >> I'd rather have the index used for the selective test, and apply >> the remaining tests to the rows retrieved from the heap. > > Uh, that was exactly my point. Indexing common words is a waste. Perhaps I'm missing something. My point was that there are words which are too common to be useful for index searches, yet uncommon enough to usefully limit the results. These words could typically benefit from tsearch2 style parsing and dictionaries; so declaring them as stop words would be bad from a functional perspective, yet searching an index for them would be bad from a performance perspective. One solution would be for the users to rigorously identify all of these words, include them on one stop word list but not another, include *two* tsvector columns in the table (with and without the "iffy" words), index only the one with the larger stop word list, and generate two tsquery values to search the two different columns. Best of both worlds. Sort of. The staff time to create and maintain such a list would obviously be costly and writing the queries would be error-prone. Second best would be to somehow recognize the "iffy" words and exclude them from the index and the index search phase, but apply the check when the row is retrieved from the heap. I really have a hard time seeing how the conditional exclusion from the index could be accomplished, though. Next best would be to let them fall into the index, but exclude top level ANDed values from the index search, applying them only to the recheck when the row is read from the heap. The seems, at least conceptually, like it could be done. -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance