Re: High update activity, PostgreSQL vs BigDBMS

Dave Cramer <pg@xxxxxxxxxxxxx> · Thu, 28 Dec 2006 07:38:04 -0500

Guy,

Did you tune postgresql ? How much memory does the box have? Have you  
tuned postgresql ?

Dave
On 28-Dec-06, at 12:46 AM, Guy Rouillier wrote:

I don't want to violate any license agreement by discussing  
performance, so I'll refer to a large, commercial PostgreSQL- 
compatible DBMS only as BigDBMS here.

I'm trying to convince my employer to replace BigDBMS with  
PostgreSQL for at least some of our Java applications.  As a proof  
of concept, I started with a high-volume (but conceptually simple)  
network data collection application.  This application collects  
files of 5-minute usage statistics from our network devices, and  
stores a raw form of these stats into one table and a normalized  
form into a second table. We are currently storing about 12 million  
rows a day in the normalized table, and each month we start new  
tables.  For the normalized data, the app inserts rows initialized  
to zero for the entire current day first thing in the morning, then  
throughout the day as stats are received, executes updates against  
existing rows.  So the app has very high update activity.

In my test environment, I have a dual-x86 Linux platform running  
the application, and an old 4-CPU Sun Enterprise 4500 running  
BigDBMS and PostgreSQL 8.2.0 (only one at a time.)  The Sun box has  
4 disk arrays attached, each with 12 SCSI hard disks (a D1000 and 3  
A1000, for those familiar with these devices.)  The arrays are set  
up with RAID5.  So I'm working with a consistent hardware platform  
for this comparison.  I'm only processing a small subset of files  
(144.)

BigDBMS processed this set of data in 20000 seconds, with all  
foreign keys in place.  With all foreign keys in place, PG took  
54000 seconds to complete the same job.  I've tried various  
approaches to autovacuum (none, 30-seconds) and it doesn't seem to  
make much difference.  What does seem to make a difference is  
eliminating all the foreign keys; in that configuration, PG takes  
about 30000 seconds.  Better, but BigDBMS still has it beat  
significantly.

I've got PG configured so that that the system database is on disk  
array 2, as are the transaction log files.  The default table space  
for the test database is disk array 3.  I've got all the reference  
tables (the tables to which the foreign keys in the stats tables  
refer) on this array.  I also store the stats tables on this  
array.  Finally, I put the indexes for the stats tables on disk  
array 4.  I don't use disk array 1 because I believe it is a  
software array.

I'm out of ideas how to improve this picture any further.  I'd  
appreciate some suggestions.  Thanks.

--
Guy Rouillier

---------------------------(end of  
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings