> I have an application wherein a process needs to read data from a stream and
> store the records for further analysis and reporting. The data in the stream
> is in the form of variable length records with clearly defined fields – so
> it can be stored in a database or in a file. The only caveat is that the
> rate of records coming in the stream could be several 1000 records a second.
> The design choice I am faced with currently is whether to use a postgres
> database or a flat file for this purpose. My application already maintains a
> postgres (8.3.4) database for other reasons – so it seemed like the
> straightforward thing to do. However I am concerned about the performance
> overhead of writing several 1000 records a second to the database. The same
> database is being used simultaneously for other activities as well and I do
> not want those to be adversely affected by this operation (especially the
> query times). The advantage of running complex queries to mine the data in
> various different ways is very appealing but the performance concerns are
> making me wonder if just using a flat file to store the data would be a
> better approach.
>
> Anybody have any experience in high frequency writes to a postgres database?

As mentioned earlier in this thread, make sure your hardware can scale. You may hit a "monolithic hardware" wall and have to distribute your data across multiple boxes, with your application managing the distribution and access.

Since fast writes are critical, a RAID 10 storage architecture on a multi-core box (preferably 8 cores) with fast SCSI disks (15K rpm) may be a good starting point.

We have a similar requirement, and we scale by distributing the data across multiple boxes. This is key.

If you need to run complex queries, plan on aggregation strategies (processes that aggregate and optimize the data storage to facilitate faster access).

Partitioning is key. You will need to purge old data at some point.
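To make the partitioning point a bit more concrete, here is a minimal sketch in Python that only generates the SQL text for daily partitions and for purging old ones. It assumes the inheritance-based partitioning available in the 8.3 series (child tables with CHECK constraints). The table name stream_records, the column recorded_at, the example date, and the 7-day retention are all hypothetical, chosen for illustration and not taken from anyone's actual schema.

```python
from datetime import date, timedelta

PARENT = "stream_records"   # hypothetical parent table name
TS_COLUMN = "recorded_at"   # hypothetical timestamp column


def partition_name(day: date) -> str:
    """Name of the child table that holds one day's records."""
    return f"{PARENT}_{day:%Y%m%d}"


def create_partition_sql(day: date) -> str:
    """DDL for one daily child table (inheritance-based partitioning)."""
    next_day = day + timedelta(days=1)
    return (
        f"CREATE TABLE {partition_name(day)} (\n"
        f"    CHECK ({TS_COLUMN} >= DATE '{day.isoformat()}'\n"
        f"       AND {TS_COLUMN} < DATE '{next_day.isoformat()}')\n"
        f") INHERITS ({PARENT});"
    )


def purge_sql(today: date, keep_days: int, existing: list) -> list:
    """DROP TABLE statements for partitions older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    return [
        f"DROP TABLE {partition_name(d)};"
        for d in sorted(existing)
        if d < cutoff
    ]


if __name__ == "__main__":
    today = date(2008, 11, 5)  # arbitrary example date
    print(create_partition_sql(today))
    # Pretend the last ten daily partitions exist; keep only seven days.
    days = [today - timedelta(days=n) for n in range(10)]
    for stmt in purge_sql(today, keep_days=7, existing=days):
        print(stmt)
```

The point of the sketch is that purging becomes a DROP TABLE on a whole child table instead of a large DELETE on one big table. In a real setup the loader would presumably write straight into the current day's child table, and constraint_exclusion would need to be enabled for queries on the parent to skip irrelevant partitions.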
Without partitions, you will run into trouble with both the time taken to delete old data and the availability of disk space.

These are just guidelines for a big warehouse-style database.

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance