Greetings,

* kpi6288@xxxxxxxxx (kpi6288@xxxxxxxxx) wrote:
> The CTE mentioned below completes the query in 4.5 seconds while the
> regular query takes 66 seconds. I read from EXPLAIN ANALYSE that the
> regular query starts with a full table scan over "Doc" while the CTE
> joins the two tables first and applies the filter condition in the 2nd
> step.
>
> I believe that some rows in "Doc" which are not referenced by "F" contain
> a large amount of data in the field "szText" and this will slow down the
> ILIKE operator.

Yup, that appears to be what's happening.

> What can I do to improve the performance of the regular query without
> using a CTE?

You could possibly build a trigram index on the field you're searching,
which could avoid the full table scan.  Of course, that index could be
quite large, so there are downsides to that.

If these are words you're looking for then you could use PG's full text
indexing to build indexes on the words and then use that instead.

If you are fine working with words but are concerned about misspellings,
then you can extract the distinct words, build a trigram index on those,
find the most similar words based on the input, and then search for those
words using the FTI.

Rough, untested sketches of each of these are appended at the end of this
mail.

Unfortunately, we don't currently pay attention to things like average
string length when considering the cost of performing an ILIKE, so we
figure that doing the filtering first and then the join will be faster,
but that obviously falls over in some cases, like this one.  Using the CTE
forces PG (today, at least) to do the join first, but that isn't really
something to rely on.

Thanks!

Stephen
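
Sketch of the trigram approach.  The index name is made up, and I'm
assuming "Doc" and "szText" really are mixed-case (quoted) identifiers, so
adjust to your actual schema:

    -- pg_trgm ships in contrib; a GIN trigram index can serve
    -- LIKE/ILIKE '%...%' pattern matches without a seqscan over "Doc".
    CREATE EXTENSION IF NOT EXISTS pg_trgm;

    CREATE INDEX doc_sztext_trgm_idx
        ON "Doc" USING gin ("szText" gin_trgm_ops);

The regular query (the join plus WHERE "szText" ILIKE '%term%') should
then be able to use the index instead of scanning all of "Doc".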
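
If whole-word searches are enough, a full text index might look something
like this ('english' is just a guess at the right text search
configuration for your data):

    -- Expression index over a tsvector built from "szText".
    CREATE INDEX doc_sztext_fts_idx
        ON "Doc" USING gin (to_tsvector('english', "szText"));

    -- The query has to use the same expression so the planner can match
    -- it to the index; plainto_tsquery() turns user input into a tsquery.
    SELECT d.*
      FROM "Doc" d
     WHERE to_tsvector('english', d."szText")
           @@ plainto_tsquery('english', 'search words');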
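
And a sketch of the misspelling-tolerant variant: pull the distinct words
out of the documents, trigram-index that word list, map the user's
(possibly misspelled) input onto the closest real words, and then run
those through the full text index above.  The doc_words table and the
example input are made up, and the word list would need to be rebuilt (or
maintained with a trigger) as "Doc" changes:

    -- ts_stat() extracts the distinct lexemes from the documents.
    CREATE TABLE doc_words AS
        SELECT word
          FROM ts_stat('SELECT to_tsvector(''english'', "szText") FROM "Doc"');

    CREATE INDEX doc_words_trgm_idx
        ON doc_words USING gin (word gin_trgm_ops);

    -- Find the words most similar to the input ('%' is pg_trgm's
    -- similarity operator, similarity() gives the score)...
    SELECT word
      FROM doc_words
     WHERE word % 'serch'
     ORDER BY similarity(word, 'serch') DESC
     LIMIT 5;

    -- ...and then OR the results together into the tsquery, e.g.
    -- to_tsquery('english', 'search | searched').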