Re: PostgreSQL, clusters and load-balance

Shane Ambler <pgsql@xxxxxxxxxx> · Wed, 26 Mar 2008 10:56:38 +1030

Bill Wordsworth wrote:
On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@xxxxxxx> wrote:

Bill Wordsworth wrote on 25.03.2008 19:16:
When traffic goes up, my webserver creates multiple instances of
postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
"clusters", except that they are not aware of each other?
No, absolutely not. Each client connection is handled by a single postgres
process, which is spawned by the postmaster upon connection.

Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, is
this spawning being done by postmaster or webserver or both? If postmaster,
does an application-level persistent connection request communicate itself
directly to the postmaster, and can the postmaster keep track of its
spawning?

In simplified terms - you have one main postgres process (the postmaster)
that listens for connections and starts the other processes. Then you have
one postgres process running for each client connected to the server at any
given time. This per-client process handles all requests to and from the
client, and it reads and caches data through memory that is shared by all of
the postgres processes. You will have one of these postgres processes running
for each concurrent db connection required by the web server.

With the scripting used for building your web pages - each time you open 
a connection you start a postgres client process running as you have 
seen happen. Then when you close the connection the client process for 
that will finish.

If you are using persistent connections - then when you close a
connection the web server keeps it open, so the postgres process stays
running and is used again for the next new connection, saving the time
of starting the process up.
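
To make the difference concrete, here is a rough sketch in Python. It is
only a toy model and does not talk to PostgreSQL at all - ToyServer,
PersistentPool and the pid numbers are names I made up for illustration.
It just counts how many per-connection processes would get started with
and without persistent connections.

```python
import itertools


class ToyServer:
    """Hypothetical stand-in for the postmaster: it only counts the
    per-connection processes it would start, no real processes exist."""

    def __init__(self):
        self._pids = itertools.count(1000)  # made-up process ids
        self.live = set()                   # processes currently running
        self.started = 0                    # total processes ever started

    def connect(self):
        # A new connection means a new per-connection process.
        pid = next(self._pids)
        self.live.add(pid)
        self.started += 1
        return pid

    def disconnect(self, pid):
        # Closing the connection ends its process.
        self.live.discard(pid)


class PersistentPool:
    """Hypothetical model of persistent connections on the web server side."""

    def __init__(self, server):
        self.server = server
        self.idle = []  # connections kept open between requests

    def acquire(self):
        # Reuse an idle connection if there is one, otherwise open a new one.
        if self.idle:
            return self.idle.pop()
        return self.server.connect()

    def release(self, pid):
        # "Closing" only hands the connection back, the process keeps running.
        self.idle.append(pid)


def run_plain(server, requests):
    # Open and close a connection for every page request.
    for _ in range(requests):
        pid = server.connect()
        server.disconnect(pid)


def run_persistent(server, requests):
    # Serve every page request through one pool of persistent connections.
    pool = PersistentPool(server)
    for _ in range(requests):
        pid = pool.acquire()
        pool.release(pid)


plain = ToyServer()
run_plain(plain, 5)
persistent = ToyServer()
run_persistent(persistent, 5)
print(plain.started, persistent.started)  # prints: 5 1
```

Five page requests one after another start five processes in the plain
case and only one in the persistent case. With requests arriving at the
same time the pool would hold more than one connection, but the saving
is the same idea.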

Also, at some crude level, if I were to direct every alternate connection to
a different install box of postgresql, won't that help with *some*
load-balance?
Cheers, Bill

All of these postgres processes will be running on the one machine - 
this may be the same machine as the web server or a separate one. You 
can use replication to store the same data on more than one server and 
use all of them for responding to selects for the web server.

Most replication options go for only using one of these servers for 
updates and the others for selects only. You can then use pooling 
options such as pgpool (or code it into your scripting if you wish) to 
distribute your connection requests between these replicated servers.
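
If you did want to code it into your scripting, the idea could look
something like the sketch below. This is an assumption-heavy illustration,
not how pgpool works internally: ReplicaRouter and the host names are
invented, and it only picks a server name for a statement instead of
opening real connections.

```python
import itertools


class ReplicaRouter:
    """Hypothetical router: updates go to one server, selects are spread
    round-robin over the replicated servers."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, everything falls back to the primary.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        # Naive check on the first word of the statement only.
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._replicas)
        return self.primary


# Made-up host names, purely for illustration.
router = ReplicaRouter("db-primary", ["db-replica-1", "db-replica-2"])
for statement in ("SELECT * FROM pages",
                  "UPDATE pages SET hits = hits + 1",
                  "SELECT * FROM users"):
    print(router.route(statement), "<-", statement)
```

Checking only the first word is too simple for real use. A SELECT ... FOR
UPDATE, or a select that calls a function which writes, would need to go
to the update server, and the select-only servers may be slightly behind
it depending on the replication used.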

--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)