On 10/20/07, Pavel Velikhov <pvelikhov@xxxxxxxxx> wrote:
> Left the query running for 10+ hours and had to kill it. I guess there
> really was no need to have lots of
> shared buffers (the hope was that postgresql will cache the whole table). I
> ended up doing this step inside
> the application as a pre-processing step. Can't have postgres running with
> different fsych options since this
> will be part of an "easy to install and run" app, that should just require a
> typical PosgreSQL installation.
>Is the size always different? If not, you could limit the updates:
>UPDATE links
> SET target_size = size
>FROM articles
>WHERE articles.article_id = links.article_to
> AND links.target_size != articles.size;
Ah, this sounds better for sure! But its probably as good as the scan with an index-scan subquery I was getting before...
>Since this is a huge operation, what about trying:
>CREATE TABLE links_new AS SELECT l.col1, l.col2, a.size as
>target_size, l.col4, ... FROM links l, articles a WHERE a.article_id =
>l.article_to;
>Then truncate links, copy the data from links_new. Alternatively, you
>could drop links, rename links_new to links, and recreate the
>constraints.
>I guess the real question is application design. Why doesn't this app
>store size at runtime instead of having to batch this huge update?
This is a link analysis application, I need to materialize all the sizes for target
articles in order to have the runtime part (vs. the loading part) run efficiently. I.e.
I really want to avoid a join with the articles table at runtime.
I have solved the size problem by other means (I compute it in my loader), but
I still have one query that needs to update a pretty large percentage of the links table...
I have previously used mysql, and for some reason I didn't have a problem with queries
like this (on the other hand mysql was crashing when building an index on article_to in the
links relation, so I had to work without a critical index)...
Thank!
Pavel
--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@xxxxxxxxxxxxxxxx
Edison, NJ 08837 | http://www.enterprisedb.com/
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
> Left the query running for 10+ hours and had to kill it. I guess there
> really was no need to have lots of
> shared buffers (the hope was that postgresql will cache the whole table). I
> ended up doing this step inside
> the application as a pre-processing step. Can't have postgres running with
> different fsych options since this
> will be part of an "easy to install and run" app, that should just require a
> typical PosgreSQL installation.
>Is the size always different? If not, you could limit the updates:
>UPDATE links
> SET target_size = size
>FROM articles
>WHERE articles.article_id = links.article_to
> AND links.target_size != articles.size;
Ah, this sounds better for sure! But its probably as good as the scan with an index-scan subquery I was getting before...
>Since this is a huge operation, what about trying:
>CREATE TABLE links_new AS SELECT l.col1, l.col2, a.size as
>target_size, l.col4, ... FROM links l, articles a WHERE a.article_id =
>l.article_to;
>Then truncate links, copy the data from links_new. Alternatively, you
>could drop links, rename links_new to links, and recreate the
>constraints.
>I guess the real question is application design. Why doesn't this app
>store size at runtime instead of having to batch this huge update?
This is a link analysis application, I need to materialize all the sizes for target
articles in order to have the runtime part (vs. the loading part) run efficiently. I.e.
I really want to avoid a join with the articles table at runtime.
I have solved the size problem by other means (I compute it in my loader), but
I still have one query that needs to update a pretty large percentage of the links table...
I have previously used mysql, and for some reason I didn't have a problem with queries
like this (on the other hand mysql was crashing when building an index on article_to in the
links relation, so I had to work without a critical index)...
Thank!
Pavel
--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@xxxxxxxxxxxxxxxx
Edison, NJ 08837 | http://www.enterprisedb.com/
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com