Guidance Requested - Bulk Inserting + Queries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Experts,

Quick Summary: data can now be inserted very quickly via COPY + removing
indexes, but is there a design or some tricks to still allow someone to
query while the partition is still active and 'hot' ?

- Postgres 9.1
- Windows 7 (64-bit) , although this is just for the current test and
could vary depending on situation
- We have 4 main tables with daily partitions
- Each table/partition has multiple indexes on it
- Streaming logs from client machines into our server app which
processes the logs and tries to shove all that data into these daily
partitions as fast as it can. 
- Using COPY and removed original primary key unique constraints to try
to get it to be as fast as possible (some duplicates are possible)
- Will remove duplicates in a later step (disregard for this post)

We now found (thanks Andres and Snow-Man in #postgresql) that in our
tests, after the indexes get too large performance drops signficantly
and our system limps forward due to  disk reads (presumably for the
indexes).  If we remove the indexes, performance for our entire sample
test is great and everything is written to postgresql very quickly. 
This allows us to shove lots and lots of data in (for production
possibly 100 GB or a TB per day!)

My question is, what possible routes can I take where we can have both
fast inserts (with indexes removed until the end of the day), but still
allow a user to query against today's data? Is this even possible?  One
idea would be possibly have hourly tables for today and as soon as we
can try to re-add indexes.  Another possible solution might be to stream
the data to another "reader" postgres instance that has indexes,
although I'm not very versed in replication.


Any ideas would be greatly appreciated.

Thanks!

Ben


-- 
Benjamin Johnson 
http://getcarbonblack.com/ | @getcarbonblack
cell: 312.933.3612 | @chicagoben


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux