[rearranging to correct for top-posting]

Joel Reymont <joelr1@xxxxxxxxx> wrote:
> Kevin Grittner <Kevin.Grittner@xxxxxxxxxxxx> wrote:
>> Joel Reymont <joelr1@xxxxxxxxx> wrote:
>>
>>> We have 2 million documents now and linking an ad to all of them
>>> takes 5 minutes on my top-of-the-line SSD MacBook Pro.
>>
>> How long does it take to run just the SELECT part of the INSERT
>> by itself?

> Are you suggesting eliminating the physical linking and
> calculating matching documents on the fly?

I'm not suggesting anything other than that it's a good idea to
determine where the time is being spent before trying to make it
faster.

You showed this as the apparent source of the five-minute delay:

INSERT INTO doc_ads (doc_id, ad_id, distance)
  SELECT doc, (t).ad_id, (t).distance
    FROM (SELECT ads_within_distance(topics, threshold) AS t
            FROM docs WHERE id = doc) AS x;

What we don't know is how much of that time is due to writing to the
doc_ads table, and how much is due to reading the other tables. We
can find that out by running just this:

SELECT doc, (t).ad_id, (t).distance
  FROM (SELECT ads_within_distance(topics, threshold) AS t
          FROM docs WHERE id = doc) AS x;

If this is where most of the time is, the next thing is to run it
with EXPLAIN ANALYZE and post the output. It's a whole different set
of things to try to tune if that part is fast and the INSERT itself
is slow.

Of course, be aware of caching effects when you time this.

-Kevin

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
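[For anyone following along, one way to carry out the timing test in
psql might look like the sketch below. It assumes the table and
function names from the INSERT above; `doc` appears to be a variable
from the surrounding function, so an actual document id would need to
be substituted for it when testing by hand:]

```sql
-- Enable client-side timing so psql reports elapsed time per query.
\timing on

-- Run the SELECT part alone, with EXPLAIN ANALYZE to see where the
-- time goes inside the plan. The BUFFERS option (PostgreSQL 9.0+)
-- also reports how many pages came from cache versus disk.
-- Run it at least twice: the first run may pay the cost of
-- populating the cache, which is the caching effect Kevin mentions.
EXPLAIN (ANALYZE, BUFFERS)
SELECT doc, (t).ad_id, (t).distance
  FROM (SELECT ads_within_distance(topics, threshold) AS t
          FROM docs WHERE id = doc) AS x;
```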