Re: scale up (postgresql vs mssql)

Andrew Dunstan <andrew@xxxxxxxxxxxx> · Fri, 13 Apr 2012 10:32:11 -0400

On 04/11/2012 06:11 PM, Eyal Wilde wrote:
hi,

i had a stored procedure in ms-sql server. this stored procedure gets 
a parameter (account-id), dose about 20 queries, fills some temporary 
tables, and finally, returns a few result-sets. this stored procedure 
converted to stored function in postgresql (9.1). the result-sets are 
being returned using refcursors. this stored function is logically, 
almost identical to the ms-sql stored procedure.  a LOT of work had 
been done to make postgresql getting close to ms-sql speed (preparing 
temp-tables in advance, using "analyze" in special places inside the 
stored function in order to hint the optimizer that the temp-tables 
have very few records, thus eliminating unnecessary and expansive 
hash-join, and a lot more..). after all that, the stored function is 
running in a reasonable speed (normally ~60 milliseconds).

now, i run a test that simulates 20 simultaneous clients, asking for 
"account-id" randomly. once a client get a result, it immediately asks 
for another one. the test last 5 seconds.  i use a connection pool 
(with Tomcat web-server). the pool is automatically increased to ~20 
connections (as expected). the result is postgresql dose ~60 
"account-id"s, whereas ms-sql dose ~330 "account-id"s. postgresql 
shows that each "account-id" took about 400-1000 msec ,which is so 
much slower than the ~60 msec of a single execution.

in a single execution postgresql may be less the twice slower than 
ms-sql, but in 20 simultaneous clients, it's about 6 times worse. why 
is that?

the hardware is one 4-core xeon. 8GB of ram. the database size is just 
a few GB's. centos-6.2.

do you think the fact that postgresql use a process per connection 
(instead of multi-threading) is inherently a weakness of postgrsql, 
regarding scale-up?
would it be better to limit the number of connections to something 
like 4, so that executions don't interrupt each other?

thanks in advance for any help!

I doubt that the process-per-connection has much effect, especially on 
Linux where process creation is extremely cheap, and you're using a 
connection pooler anyway. The server is pretty modest, though. If you 
can add enough RAM that you can fit the whole db into Postgres shared 
buffers you might find things run a whole lot better. You should show us 
your memory settings, among other things - especially shared_buffers, 
temp_buffers and work_mem.

cheers

andrew

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance