How to keep queries low latency as concurrency increases

Catalin Iacob <iacobcatalin@xxxxxxxxx> · Tue, 30 Oct 2012 01:11:31 +0100

As I increase concurrency I'm experiencing what I believe are too slow
queries given the minuscule amount of data in my tables.

I have 20 Django worker processes and use ab to generate 3000 requests
to a particular URL which is doing some read only queries. I ran this
with ab concurrency level set to 4, 12 and 20. With some aggregation
using pgbadger here are the results:

concurrency 4
Number of queries: 39,046
Total query duration: 4.255s
Slowest query: 33ms
Total taken to execute slowest query 6000 times: 1.633s
Number of queries taking over 100ms: 0
Number of queries taking over 50ms: 0
Number of queries taking over 25ms: 1
Number of queries taking over 10ms: 7

concurrency 12
Number of queries: 39,035
Total query duration: 7.435s
Slowest query: 174ms
Total taken to execute slowest query 6000 times: 2.617s
Number of queries taking over 100ms: 2
Number of queries taking over 50ms: 4
Number of queries taking over 25ms: 17
Number of queries taking over 10ms: 99

concurrency 20
Number of queries: 39,043
Total query duration: 11.614s
Slowest query: 198ms
Total taken to execute slowest query 6000 times: 4.286s
Number of queries taking over 100ms: 5
Number of queries taking over 50ms: 19
Number of queries taking over 25ms: 52
Number of queries taking over 10ms: 255

All tests have 0 INSERTs, 0 UPDATEs, 0 DELETEs, aprox. 18000 SELECTs
and 21000 OTHERs (Django's ORM sends a lot of SET TIME ZONE, SET
default_transaction_isolation TO 'READ committed'; etc)

The 3 queries that take longest in total are:
SELECT "django_site"."id", "django_site"."domain",
"django_site"."name", "vwf_customsite"."site_ptr_id",
"vwf_customsite"."geo_reference_id",
"vwf_customsite"."friendly_domain", "vwf_customsite"."ws_machine",
"vwf_customsite"."public", "vwf_customsite"."user_limit",
"vwf_customsite"."hidden_login_and_registration",
"vwf_customsite"."logo", "vwf_customsite"."LANGUAGE",
"vwf_customsite"."ga_tracker_id", "vwf_customsite"."always_running",
"vwf_customsite"."deleted", "vwf_customsite"."version",
"vwf_customsite"."contact_email" FROM "vwf_customsite" INNER JOIN
"django_site" ON ( "vwf_customsite"."site_ptr_id" = "django_site"."id"
) WHERE "vwf_customsite"."site_ptr_id" = 0;

SELECT "vwf_plugin"."id", "vwf_plugin"."name", "vwf_plugin"."site_id",
"vwf_plugin"."enabled" FROM "vwf_plugin" WHERE (
"vwf_plugin"."site_id" = 0 AND "vwf_plugin"."name" = '' ) ;

SELECT "django_site"."id", "django_site"."domain",
"django_site"."name" FROM "django_site" WHERE "django_site"."domain" =
'';

The tables are extremely small: django_site has 8 rows, vwf_customsite
has 7 and vwf_plugin 43. My intuition would say that for these read
only queries on tables this small no query should take more than 5 ms
even for a concurrency level of 20 and that performance shouldn't
degrade at all when going from 4 to 20 concurrent ab requests. The
CPUs are also used only about 10% so there should be plenty of
capacity for more concurrency.

The numbers above show a different situation though. The average for
the slowest query stays under 1ms but it grows when increasing
concurrency and there are spikes that really take too long IMO.

Am I right that it should be possible to do better and if so how?
Thanks a lot for any ideas or insights!

More details about my setup:

The schemas:
                                 Table "public.django_site"
 Column |          Type          |                        Modifiers
--------+------------------------+----------------------------------------------------------
 id     | integer                | not null default
nextval('django_site_id_seq'::regclass)
 domain | character varying(100) | not null
 name   | character varying(50)  | not null
Indexes:
    "django_site_pkey" PRIMARY KEY, btree (id)
Referenced by:
<snip list of 25 tables>

                   Table "public.vwf_customsite"
            Column             |          Type          | Modifiers
-------------------------------+------------------------+-----------
 site_ptr_id                   | integer                | not null
 geo_reference_id              | integer                |
 friendly_domain               | character varying(100) | not null
 public                        | boolean                | not null
 logo                          | character varying(100) |
 language                      | character varying(2)   | not null
 ga_tracker_id                 | character varying(16)  | not null
 version                       | character varying(100) | not null
 contact_email                 | character varying(254) | not null
 always_running                | boolean                | not null
 deleted                       | boolean                | not null
 ws_machine                    | character varying(100) | not null
 user_limit                    | integer                | not null
 hidden_login_and_registration | boolean                | not null
Indexes:
    "vwf_customsite_pkey" PRIMARY KEY, btree (site_ptr_id)
    "vwf_customsite_geo_reference_id" btree (geo_reference_id)
Foreign-key constraints:
    "geo_reference_id_refs_id_488579c58f2d1a89" FOREIGN KEY
(geo_reference_id) REFERENCES geo_reference_georeference(id)
DEFERRABLE INITIALLY DEFERRED
    "site_ptr_id_refs_id_712ff223c9517f55" FOREIGN KEY (site_ptr_id)
REFERENCES django_site(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
<snip list of 1 table>

                                 Table "public.vwf_plugin"
 Column  |          Type          |                        Modifiers
---------+------------------------+---------------------------------------------------------
 id      | integer                | not null default
nextval('vwf_plugin_id_seq'::regclass)
 name    | character varying(255) | not null
 site_id | integer                | not null
 enabled | boolean                | not null default false
Indexes:
    "vwf_plugin_pkey" PRIMARY KEY, btree (id)
    "vwf_plugin_site_id" btree (site_id)
Foreign-key constraints:
    "site_id_refs_id_4ac2846d79527bae" FOREIGN KEY (site_id)
REFERENCES django_site(id) DEFERRABLE INITIALLY DEFERRED

Hardware:
Virtual machine running on top of VMWare
4 cores, Intel(R) Xeon(R) CPU           E5645  @ 2.40GHz
4GB of RAM

Disk that is virtual enough that I have no idea what it is, I know
that there's some big storage shared between multiple virtual
machines. Filesystem is ext4 with default mount options. I can imagine
IO performance is not great for this machine, however, for the
readonly queries and the very small tables above I would expect
everything to be cached in memory and the disk not to matter.

Ubuntu 12.04 with Postgres installed from Ubuntu's packages

pgbouncer 1.4.2 installed from Ubuntu's packages on the same machine
as Postgres. Django connects via TCP/IP to pgbouncer (it does one
connection and one transaction per request) and pgbouncer keeps
connections open to Postgres via Unix socket. The Python client is
self compiled psycopg2-2.4.5.

uname -a
Linux wcea014.virtuocity.eu 3.2.0-32-generic #51-Ubuntu SMP Wed Sep 26
21:33:09 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Non default settings
            name            |
    current_setting
----------------------------+------------------------------------------------------------------------------------------------------------
 version                    | PostgreSQL 9.1.6 on
x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro
4.6.3-1ubuntu5) 4.6.3, 64-bit
 client_encoding            | UTF8
 effective_cache_size       | 1000MB
 external_pid_file          | /var/run/postgresql/9.1-main.pid
 lc_collate                 | en_US.UTF-8
 lc_ctype                   | en_US.UTF-8
 log_checkpoints            | on
 log_connections            | on
 log_destination            | stderr
 log_directory              | /var/log/postgresql
 log_disconnections         | on
 log_filename               | postgresql-%Y-%m-%d-concTODO.log
 log_line_prefix            | %t [%p]: [%l-1]
 log_lock_waits             | on
 log_min_duration_statement | 0
 log_rotation_size          | 0
 log_temp_files             | 0
 logging_collector          | on
 maintenance_work_mem       | 400MB
 max_connections            | 100
 max_stack_depth            | 2MB
 port                       | 2345
 random_page_cost           | 2
 server_encoding            | UTF8
 shared_buffers             | 800MB
 ssl                        | on
 TimeZone                   | localtime
 unix_socket_directory      | /var/run/postgresql
 wal_buffers                | 16MB
 work_mem                   | 10MB

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance