2006/12/27, lopezf@xxxxxxxxxxxxx <lopezf@xxxxxxxxxxxxx>:
Hi everybody, I'm looking for a database system for a SCADA system. The major problem, I think, is performance, because the application is going to poll about 4k variables per second from hardware and has to record the values in the process table. I heard that PostgreSQL provides a bulk loading mechanism called COPY, which takes tab-delimited or CSV input from a file. Where COPY can be used instead of hundreds or thousands of INSERTs, it can cut execution time. I'm less than a novice, so I'd appreciate any piece of advice.
I believe you could easily simulate the load with a small fake-SCADA program and see how the hardware at your disposal handles it with PostgreSQL, a different RDBMS or simply a flat file. Write a small program that generates a set of 4k random values and sends them asynchronously over the network to your data acquisition application, which should store the data in the database. Measure how fast you can send the data and still record everything.

If data acquisition speed is your primary concern (as it seems to be), you might want to use a simple .csv file: you'll probably beat the performance of any database management system. You could periodically move the saved data from the .csv files into a database (say, PostgreSQL) where you could (I assume) analyze it. You might want to use a separate machine for the database management system, so as to keep any unnecessary CPU and I/O load off the primary data storage machine.

I don't think your load (32 kB/s if your variables are double-precision float values) is a challenge, but running any kind of analysis on a machine that is constrained to basically real-time response might cost you lost data, and I don't know if you can afford that.

Cheers,
t.n.a.
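
P.S. A rough sketch of the fake-SCADA test described above, in Python (the language is just a convenient choice for illustration). It skips the network hop and only covers the "generate 4k random values, append them to a .csv file, measure how fast that goes" part. The file name process_log.csv, the row layout (timestamp, variable id, value) and all function names are made up for this example. The 32 kB/s figure is simply 4000 variables times 8 bytes per double.

```python
import csv
import os
import random
import tempfile
import time

NUM_VARIABLES = 4000   # "about 4k variables per second" from the original question
BYTES_PER_DOUBLE = 8   # assumption: every variable is a double-precision float


def raw_data_rate(num_variables=NUM_VARIABLES, bytes_per_value=BYTES_PER_DOUBLE):
    """Payload bytes per second when every variable is polled once per second."""
    return num_variables * bytes_per_value


def generate_poll(num_variables=NUM_VARIABLES):
    """Return one fake poll: a (timestamp, variable_id, value) row per variable."""
    now = time.time()
    return [(now, var_id, random.random()) for var_id in range(num_variables)]


def append_poll(path, rows):
    """Append one poll to a CSV file and force it to disk before returning."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerows(rows)
        f.flush()
        os.fsync(f.fileno())


def measure(path, polls=10):
    """Write `polls` fake polls back to back and return rows written per second."""
    start = time.perf_counter()
    for _ in range(polls):
        append_poll(path, generate_poll())
    elapsed = time.perf_counter() - start
    return polls * NUM_VARIABLES / elapsed


if __name__ == "__main__":
    print("raw payload:", raw_data_rate(), "bytes/s")
    with tempfile.TemporaryDirectory() as tmp:
        rate = measure(os.path.join(tmp, "process_log.csv"))
        print(f"sustained roughly {rate:.0f} rows/s to CSV")
```

If the measured rate comes out comfortably above 4000 rows/s on the target hardware, the flat-file approach has headroom. The same harness could be pointed at a database writer instead of append_poll to compare the two. The fsync call is a deliberately pessimistic choice, so the number reflects data that has actually reached the disk.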