2xCPU Six-Core AMD Opteron 2439 SE
64GB RAM
Raid controller Perc6 512MB cache NV
- 2 HD 146GB SAS 15Krpm RAID1 (SO Centos 5.4 y pg_xlog) (XFS no barriers)
- 6 HD 300GB SAS 15Krpm RAID10 (DB Postgres 8.3.9) (XFS no barriers)
Server B (new)
------------------
2xCPU 16 Core AMD Opteron 6282 SE
64GB RAM
Raid controller H700 1GB cache NV
- 2HD 74GB SAS 15Krpm RAID1 stripe 16k (SO Centos 6.2)
- 4HD 146GB SAS 15Krpm RAID10 stripe 16k XFS (pg_xlog) (ext4 bs 4096, no barriers)
Raid controller H800 1GB cache nv
- MD1200 12HD 300GB SAS 15Krpm RAID10 stripe 256k (DB Postgres 8.3.18) (ext4 bs 4096, stride 64, stripe-width 384, no barriers)
Postgres DB is the same in both servers. This DB has 170GB size with some tables partitioned by date with a trigger. In both shared_buffers, checkpoint_segments... settings are similar because RAM is similar.
I supposed that, new server had to be faster than old, because have more disk in RAID10 and two RAID controllers with more cache memory, but really I'm not obtaining the expected results
For example this query:
EXPLAIN ANALYZE SELECT c.id AS c__id, c.fk_news_id AS c__fk_news_id, c.fk_news_group_id AS c__fk_news_group_id, c.fk_company_id AS c__fk_company_id, c.import_date AS c__import_date, c.highlight AS c__highlight, c.status AS c__status, c.ord AS c__ord, c.news_date AS c__news_date, c.fk_media_id AS c__fk_media_id, c.title AS c__title, c.search_title_idx AS c__search_title_idx, c.stored AS c__stored, c.tono AS c__tono, c.media_type AS c__media_type, c.fk_editions_news_id AS c__fk_editions_news_id, c.dossier_selected AS c__dossier_selected, c.update_stats AS c__update_stats, c.url_news AS c__url_news, c.url_image AS c__url_image,
m.id AS m__id,
m.name AS m__name, m.media_type AS m__media_type, m.media_code AS m__media_code, m.fk_data_source_id AS m__fk_data_source_id, m.language_iso AS m__language_iso, m.country_iso AS m__country_iso, m.region_iso AS m__region_iso, m.subregion_iso AS m__subregion_iso, m.media_code_temp AS m__media_code_temp, m.url AS m__url, m.current_rank AS m__current_rank, m.typologyid AS m__typologyid, m.fk_platform_id AS m__fk_platform_id, m.page_views_per_day AS m__page_views_per_day, m.audience AS m__audience, m.last_stats_update AS m__last_stats_update,
n.id AS n__id, n.fk_media_id AS n__fk_media_id, n.fk_news_media_id AS n__fk_news_media_id, n.fk_data_source_id AS n__fk_data_source_id, n.news_code AS n__news_code, n.title AS n__title, n.searchfull_idx AS n__searchfull_idx, n.news_date AS n__news_date, n.economical_value AS n__economical_value, n.audience AS n__audience, n.media_type AS n__media_type, n.url_news AS n__url_news, n.url_news_old AS n__url_news_old, n.url_image AS n__url_image, n.typologyid AS n__typologyid, n.author AS n__author, n.fk_platform_id AS n__fk_platform_id,
n2.id AS n2__id,
n2.name AS n2__name,
n3.id AS n3__id,
n3.name AS n3__name,
f.id AS f__id,
f.name AS f__name,
n4.id AS n4__id, n4.opentext AS n4__opentext,
i.id AS i__id,
i.name AS i__name, i.ord AS i__ord,
i2.id AS i2__id,
i2.name AS i2__name
FROM company_news_internet c
LEFT JOIN media_internet m ON c.fk_media_id =
m.id AND m.media_type = 4
LEFT JOIN news_internet n ON c.fk_news_id =
n.id AND n.media_type = 4
LEFT JOIN news_media_internet n2 ON n.fk_news_media_id =
n2.id AND n2.media_type = 4
LEFT JOIN news_group_internet n3 ON c.fk_news_group_id =
n3.id AND n3.media_type = 4
LEFT JOIN feed_internet f ON n3.fk_feed_id =
f.id LEFT JOIN news_text_internet n4 ON c.fk_news_id = n4.fk_news_id AND n4.media_type = 4
LEFT JOIN internet_typology i ON n.typologyid =
i.id LEFT JOIN internet_media_platform i2 ON n.fk_platform_id =
i2.id WHERE (c.fk_company_id = '16073' AND c.status <> '-3' AND n3.fk_feed_id = '30693' AND n3.status = '1' AND f.fk_company_id = '16073') AND n.typologyid IN ('6', '7', '1', '2', '3', '5', '4') AND
c.id > '49764393' AND c.news_date >= '2012-04-02'::timestamp -
INTERVAL '4 months' AND n.news_date >= '2012-04-02'::timestamp -
INTERVAL '4 months' AND c.fk_news_group_id IN ('43475') AND (c.media_type = 4) ORDER BY c.news_date DESC,
c.id DESC LIMIT 200
Takes about 20 second in server A but in new server B takes 150 seconds... In EXPLAIN I have noticed that sequential scan on table news_internet_201112 takes 2s:
-> Seq Scan on news_internet_201112 n (cost=0.00..119749.12 rows=1406528 width=535) (actual time=0.046..2186.379 rows=1844831 loops=1)
Filter: ((news_date >= '2011-12-02 00:00:00'::timestamp without time zone) AND (media_type = 4) AND (typologyid = ANY ('{6,7,1,2,3,5,4}'::integer[])))
While in Server B, takes 11s:
-> Seq Scan on news_internet_201112 n (cost=0.00..119520.12 rows=1405093 width=482) (actual time=0.177..11783.621 rows=1844831 loops=1)
Filter: ((news_date >= '2011-12-02 00:00:00'::timestamp without time zone) AND (media_type = 4) AND (typologyid = ANY ('{6,7,1,2,3,5,4}'::integer[])))
Is notorious that, while in server A, execution time vary only few second when I execute the same query repeated times, in server B, execution time fluctuates between 30 and 150 second despite the server dont have any client.
In other example, when I query entire table, running twice the same query:
Server 1
------------
EXPLAIN ANALYZE SELECT * from company_news_internet_201111 ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on company_news_internet_201111 (cost=0.00..457010.37 rows=6731337 width=318) (actual time=0.042..19665.155 rows=6731337 loops=1)
Total runtime: 20391.555 ms
-
EXPLAIN ANALYZE SELECT * from company_news_internet_201111 ;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on company_news_internet_201111 (cost=0.00..457010.37 rows=6731337 width=318) (actual time=0.012..2171.181 rows=6731337 loops=1)
Total runtime: 2831.028 ms
Server 2
------------
EXPLAIN ANALYZE SELECT * from company_news_internet_201111 ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on company_news_internet_201111 (cost=0.00..369577.79 rows=6765779 width=323) (actual time=0.110..10010.443 rows=6765779 loops=1)
Total runtime: 11552.818 ms
-
EXPLAIN ANALYZE SELECT * from company_news_internet_201111 ;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on company_news_internet_201111 (cost=0.00..369577.79 rows=6765779 width=323) (actual time=0.023..8173.801 rows=6765779 loops=1)
Total runtime: 12939.717 ms
It seems that Server B don cache the table¿?¿?
I'm lost, I had tested different file systems, like XFS, stripe sizes... but I not have had results
Any ideas that could be happen?
Thanks a lot!!
--
César Martín Pérez
cmartinp@xxxxxxxxx