Re: stored proc and inserting hundreds of thousands of rows

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[rearranging to correct for top-posting]
 
Joel Reymont <joelr1@xxxxxxxxx> wrote:
> Kevin Grittner <Kevin.Grittner@xxxxxxxxxxxx> wrote:
>> Joel Reymont <joelr1@xxxxxxxxx> wrote:
>>
>>> We have 2 million documents now and linking an ad to all of them
>>> takes 5 minutes on my top-of-the-line SSD MacBook Pro.
>>
>> How long does it take to run just the SELECT part of the INSERT
>> by itself?
 
> Are you suggesting eliminating the physical linking and
> calculating matching documents on the fly?
 
I'm not suggesting anything other than it being a good idea to
determine where the time is being spent before trying to make it
faster.  You showed this as the apparent source of the five minute
delay:
 
  INSERT INTO doc_ads (doc_id, ad_id, distance)
  SELECT doc, (t).ad_id, (t).distance
  FROM (SELECT ads_within_distance(topics, threshold) AS t
       FROM docs
       WHERE id = doc) AS x;
 
What we don't know is how much of that time is due to writing to the
doc_ads table, and how much is due to reading the other tables.  We
can find that out by running this:
 
  SELECT doc, (t).ad_id, (t).distance
  FROM (SELECT ads_within_distance(topics, threshold) AS t
       FROM docs
       WHERE id = doc) AS x;
 
If this is where most of the time is, the next thing is to run it
with EXPLAIN ANALYZE, and post the output.  It's a whole different
set of things to try to tune if that part is fast and the INSERT
itself is slow.
 
Of course, be aware of caching effects when you time this.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux