Re: High Frequency Inserts to Postgres Database vs Writing to a File

Jay Manni wrote:
The data in the stream is in the form of variable length records with clearly defined fields - so it can be stored in a database or in a file. The only caveat is that the rate of records coming in the stream could be several 1000 records a second.
There are a few limits to be concerned about here, some physical, some related to your application design. A few thousand records per second is possible with either the right software design or some hardware assist. The wall where it gets increasingly difficult to keep up is closer to 10K records/second, presuming your records aren't particularly wide.

Some background: when you commit a record in PostgreSQL, by default that transaction doesn't complete until the data has been physically written to disk. A typical 7200 RPM disk spins 120 times/second, meaning that even under the best possible conditions there can only be 120 commits per second to physical disk.

However, that physical commit can contain more than one record. Here are the common ways to increase the number of records you can insert per second:

1) Batch up inserts. Turn off any auto-commit behavior in your client, insert a bunch of records, and issue one COMMIT. Popular batch sizes are in the 100-1000 records/commit range. If individual records aren't very wide, you can easily get a huge speedup here. It's hard to estimate how much this will help in your case without knowing more about record width and the speed of your underlying disks; more on that below, and there's a rough sketch of this approach after the list.

2) Have multiple clients committing at once. Typically I see this give at most about a 5X speedup, so on a slow disk with single record commits you might hit 600/s instead of 120/s if you had 10 clients going at once.

3) Use a RAID controller with a battery-backed cache. This will hold multiple disk commits in its cache and dump them onto disk in larger chunks transparently, with only a small risk of corruption if there's an extended power outage that lasts longer than the battery does. Typically I'll see this increase the commit rate to the 1000-10,000 commits/second range, again depending on underlying disk speed and row size. This approach really reduces the worst-case behavior disks can get into, which is where you keep seeking back and forth between two spots, writing a small amount at each one.

4) Turn off synchronous_commit. This lets you adjust the rate at which records get grouped into larger physical commits without touching your application or hardware. It does introduce the possibility that you might lose some records if there's a crash in the middle of loading or changing things. Adjusting the commit period upwards makes this case look similar to (1): you're basically committing in larger chunks, but the app just doesn't know it. (This is the commented-out line in the sketch below.)
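
To make (1) and (4) concrete, here's a minimal sketch of the batching approach in Python with psycopg2. The connection string, table, and column names are made up for illustration, read_stream() is just a stand-in for wherever your records come from, and the batch size is only a starting point to tune:

    import datetime
    import psycopg2
    from psycopg2.extras import execute_values

    def read_stream():
        # Stand-in for the real feed: yields dummy (timestamp, payload) records.
        for i in range(10000):
            yield (datetime.datetime.now(), 'x' * 100)

    conn = psycopg2.connect("dbname=streamdb user=stream")  # hypothetical connection
    cur = conn.cursor()

    # Technique (4), optional: trade durability of the newest records for speed.
    # cur.execute("SET synchronous_commit = off")

    BATCH_SIZE = 500   # typical range is 100-1000 records per commit

    def flush(batch):
        # Technique (1): one multi-row INSERT, then a single COMMIT for the batch.
        execute_values(cur,
                       "INSERT INTO stream_records (recorded_at, payload) VALUES %s",
                       batch)
        conn.commit()

    batch = []
    for record in read_stream():
        batch.append(record)
        if len(batch) >= BATCH_SIZE:
            flush(batch)
            batch = []
    if batch:
        flush(batch)

psycopg2 leaves autocommit off by default, so everything between two conn.commit() calls rides in a single transaction; that's what turns 500 records into one physical commit.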

Basically, (2) alone is probably not enough to reach 1,000 per second. But (1) or (3) is, as is (4) if you can accept the potential data integrity issues after a crash. If your batches get bigger via any of these techniques, what should end up happening is that you push the bottleneck somewhere else, like disk write or seek speed. Which of those you'll run into depends on how interleaved these writes are with application reads, and on the total disk bandwidth.

To close, here's a quick example showing the sort of analysis you should be doing to estimate this better. Imagine you're trying to write 10,000 records/second and each record is 100 bytes wide. That works out to almost 1MB/s of steady disk writes. In the real world, a single cheap disk can't do much better than this if those writes involve heavy seeking around the disk. And that seeking does happen in a database, because at a minimum you need to write to both the main database area and the write-ahead log. If your records are 1,000 bytes wide instead, you might hit the upper limit of your disk's seeking capability at only 1,000 records/second.
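
Scripting that back-of-the-envelope arithmetic makes it easy to plug in your own numbers; the rates and widths below are just the ones from the example above, and since the database writes everything at least twice (table plus WAL), the real I/O is at least double the result:

    # Rough steady-state write bandwidth: records/second * bytes/record.
    for rate, width in [(10000, 100), (1000, 1000)]:
        mb_per_sec = rate * width / (1024.0 * 1024)
        print("%6d records/s at %5d bytes each -> about %.1f MB/s of writes"
              % (rate, width, mb_per_sec))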

Whereas if you have an app that's just writing to a file, you wouldn't necessarily expect it to regularly seek elsewhere. That makes it much more likely you'd hit >10MB/s on writes rather than the 1-2MB/s worst-case behavior when seeking. Of course, as you've already noted, you end up paying penalties on reporting instead if you go that route. The best you can do here is try to measure your application and estimate/simulate a larger volume, then see what happens if you apply one or more of these techniques.
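
If you want a number for your own hardware before settling on a design, a throwaway loop like this one (again Python/psycopg2, with a made-up connection string and test table) measures the worst case of one commit per record; rerun it with the batched loop from earlier and the gap between the two results tells you how much headroom batching alone buys you:

    import time
    import psycopg2

    conn = psycopg2.connect("dbname=streamdb user=stream")  # hypothetical connection
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS commit_test")
    cur.execute("CREATE TABLE commit_test (t timestamptz, payload text)")
    conn.commit()

    ROWS = 2000
    start = time.time()
    for i in range(ROWS):
        cur.execute("INSERT INTO commit_test VALUES (now(), %s)", ('x' * 100,))
        conn.commit()          # one commit per record: the worst case
    elapsed = time.time() - start
    print("%d commits in %.1fs = %.0f commits/second" % (ROWS, elapsed, ROWS / elapsed))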

--
Greg Smith        greg@xxxxxxxxxxxxxxx        Baltimore, MD

