My server is oddly very slow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I have 2 servers 1 for development (called dev) 1 for production (called prod). The server for development is very basic (a dual core 2.1ghtz on a basic 300MB hard drive with 2GB of memory (this server is not dedicated to postgres, it also uses php, mapserver and so on).

I assigned the memory as follow
max co 40
shared buffers 258MB
work_mem 8MB
maitainance_work_mem: 32MB
wal_buffers 8MB
check_point_seg 16
autovacuum on
effective-cache-size 744MB

Dev is installed on centos 4 32bits (on a virtual server openvz)

My production server is a QUAD CORE XEON E5310 (1,6Ghtz 3GB of RAM and the hard drive is a very fast SAN with fiberchanel: the copy of files on the SAN is about 3 times faster than on the development hard drive. This is a fully dedicated server for postgres.

I assigned the memory as follow
max co 40
shared buffers 512MB
work_mem 16MB
maitainance_work_mem: 128MB
wal_buffers 8MB
check_point_seg 16
autovacuum on
effective-cache-size 1536MB

prod is installed on Red Hat Enterprise Linux ES release 4 (Nahant Update 6) 32bits (not on a virtual server) Those 2 distribution are very similar, and I use the same proprams RPM on both systems

The version of both databases is postgres 8.4.1, I installed the same database (in fact prod comes form a dump of dev once every week) The dump is very big about 16GB compressed. Once it is dumped, I make a full analyze on the database before I use it.

Now according to the hardware, I would suspect dev should be slower than prod, but almose all queries are slower on prod, some times 10 times slower.

Today I made a query with an explain analyse (same query on both servers) and I got for dev:


  1.
     "Limit  (cost=396689.94..399801.92 rows=2000 width=512) (actual
     time=8696.467..10028.179 rows=2000 loops=1)"
  2.
     "  ->  GroupAggregate  (cost=396689.94..424470.56 rows=17854
     width=512) (actual time=8696.463..10024.632 rows=2000 loops=1)"
  3.
     "        Filter: (sum(effectif_max) >= 10)"
  4.
     "        ->  Sort  (cost=396689.94..397088.34 rows=159360
     width=512) (actual time=8693.396..8745.436 rows=8344 loops=1)"
  5.
     "              Sort Key: implantation_company_id, virtual_building_id"
  6.
     "              Sort Method:  external merge  Disk: 125928kB"
  7.
     "              ->  Bitmap Heap Scan on gen_establishment_search
      (cost=8156.66..346425.20 rows=159360 width=512) (actual
     time=2545.813..6688.078 rows=152200 loops=1)"
  8.
     "                    Recheck Cond: (gis_departement_id =
     '75'::bpchar)"
  9.
     "                    Filter: (((telephone IS NOT NULL) OR
     (telephone_siege IS NOT NULL)) AND (NOT etablissement_masque) AND
     (ref_zone_permission_id = ANY ('{2,1}'::integer[])) AND
     (created_by_user_group = 1) AND (ref_establishment_type_id = ANY
     ('{1,2,3}'::integer[])))"
 10.
     "                    ->  Bitmap Index Scan on
     gen_establishment_search_gis_departement_id  (cost=0.00..8116.82
     rows=498680 width=0) (actual time=2435.952..2435.952 rows=502304
     loops=1)"
 11.
     "                          Index Cond: (gis_departement_id =
     '75'::bpchar)"
 12.
     "Total runtime: 10109.194 ms"


-> so 10 seconds






and for prod

  1.
     Limit  (cost=397787.02..400793.19 rows=2000 width=513) (actual
     time=94188.493..97457.042 rows=2000 loops=1)
  2.
        ->  GroupAggregate  (cost=397787.02..425288.95 rows=18297
     width=513) (actual time=94188.489..97452.268 rows=2000 loops=1)
  3.
              Filter: (sum(effectif_max) >= 10)
  4.
              ->  Sort  (cost=397787.02..398179.69 rows=157069
     width=513) (actual time=94187.660..94530.617 rows=8312 loops=1)
  5.
                    Sort Key: implantation_company_id, virtual_building_id
  6.
                    Sort Method:  external merge  Disk: 124192kB
  7.
                    ->  Bitmap Heap Scan on gen_establishment_search
      (cost=8231.64..347992.15 rows=157069 width=513) (actual
     time=836.994..91818.746 rows=149851 loops=1)
  8.
                          Recheck Cond: (gis_departement_id = '75'::bpchar)
  9.
                          Filter: (((telephone IS NOT NULL) OR
     (telephone_siege IS NOT NULL)) AND (NOT etablissement_masque) AND
     (ref_zone_permission_id = ANY ('{2,1}'::int
 10.
     eger[])) AND (created_by_user_group = 1) AND
     (ref_establishment_type_id = ANY ('{1,2,3}'::integer[])))
 11.
                          ->  Bitmap Index Scan on
     gen_establishment_search_gis_departement_id  (cost=0.00..8192.37
     rows=502886 width=0) (actual time=699.530..699.530 rows=
 12.
     507213 loops=1)
 13.
                                Index Cond: (gis_departement_id =
     '75'::bpchar)
 14.
      Total runtime: 97496.574 ms


-> 90 seconds

for some other queries is is worst. None of the queries I have tested are faster on prod.

So I checked the memory on prod during my query execution:
vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 1 1280 23320 6356 2975956 0 0 29 45 1 0 2 0 95 3

The swap is not used but something is a bit odd: the cache is using 98% of the RAM

if I do
SHOW shared_buffers;
I get

shared_buffers
----------------
512MB


as I would expect from my configuration.


I've got out of idea, any idea?


--

Logo_HBS_mail.jpg
  Adrien DUCOS
  Analyste développeur
  aducos@xxxxxxxxxxxxxxxx <mailto:aducos@xxxxxxxxxxxxxxxx>
  www.hbs-research.com <http://www.hbs-research.com/>
  +33 (0)9 70 44 64 10
  24 rue de l'Est
  75020 Paris




--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux