Re: Configuration Advice

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Steve wrote:
SO ... our goal here is to make this load process take less time. It seems the big part is building the big summary table; this big summary table is currently 9 million rows big. Every night, we drop the table, re-create it, build the 9 million rows of data (we use COPY to put hte data in when it's prepared, not INSERT), and then build the indexes on it -- of which there are many.

Would it be possible to just update the summary table, instead of recreating it from scratch every night?

Unfortunately this table gets queried in a lot of different ways and needs these indexes; also unfortunately, we have operator class indexes to support both ASC and DESC sorting on columns so these are for all intents and purposes duplicate but required under Postgres 8.1 (we've recently upgraded to Postgres 8.2, is this still a requirement?)

I don't think this has changed in 8.2.

Building these indexes takes forever! It's a long grind through inserts and then building the indexes takes a hefty amount of time too. (about 9 hours). Now, the application is likely part at fault, and we're working to make it more efficient, but it has nothing to do with the index building time. I'm wondering what we can do to make this better if anything; would it be better to leave the indexes on? It doesn't seem to be. Would it be better to use INSERTs instead of copies? Doesn't seem to be.

Would it help if you created multiple indexes simultaneously? You have enough CPU to do it. Is the index creation CPU or I/O bound? 9 million rows should fit in 16 GB of memory, right?

- The load process itself takes about 6 gigs of memory, the rest is free for postgres because this is basically all the machine does.

Can you describe the load process in more detail? What's it doing with the 6 gigs?

- If this was your machine and situation, how would you lay out the emmory settings? What would you set the FSM to?

FSM seems irrelevant here..

Do wal_buffers/full_page_writes matter of FSYNC is off?

Better turn off full_page_writes, since you can kiss goodbye to data integrity anyway with fsync=off.

Anyway... any advice would be appreciated :)

What's your maintenance_work_mem setting? It can make a big difference in sorting the data for indexes.

If you could post the schema including the indexes, people might have more ideas...

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux