On Wed, Nov 30, 2011 at 7:27 AM, Benjamin Johnson <benjamin.johnson@xxxxxxxxxxxxxxxxxx> wrote: > Experts, > > Quick Summary: data can now be inserted very quickly via COPY + removing > indexes, but is there a design or some tricks to still allow someone to > query while the partition is still active and 'hot' ? > > - Postgres 9.1 > - Windows 7 (64-bit) , although this is just for the current test and > could vary depending on situation > - We have 4 main tables with daily partitions How long are the daily partitions kept for? > - Each table/partition has multiple indexes on it > - Streaming logs from client machines into our server app which > processes the logs and tries to shove all that data into these daily > partitions as fast as it can. Why shove it in as fast as you can? If you want to both read and write at the same time, then focusing first only on writing and worrying about reading as an after thought seems like the wrong thing to do. > - Using COPY and removed original primary key unique constraints to try > to get it to be as fast as possible (some duplicates are possible) > - Will remove duplicates in a later step (disregard for this post) > > We now found (thanks Andres and Snow-Man in #postgresql) that in our > tests, after the indexes get too large performance drops signficantly > and our system limps forward due to disk reads (presumably for the > indexes). How many hours worth of data can be loaded into the new partition before the performance knee hits? After the knee, how does the random disk read activity you see compare to the maximum random disk reads your IO system can support? How many COPYs were you doing at the same time? During this test, was there background select activity going on, or was the system only used for COPY? > If we remove the indexes, performance for our entire sample > test is great and everything is written to postgresql very quickly. > This allows us to shove lots and lots of data in (for production > possibly 100 GB or a TB per day!) How much do you need to shove in per day? If you need to insert it, and index it, and run queries, and deal with maintenance of the older partitions, then you will need a lot of spare capacity, relative to just inserting, to do all of those things. Do you have windows where there is less insert activity in which other things can get done? Cheers, Jeff -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance