Re: Postgres performance comparing GCP and AWS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> I expect my postgres on GPC to be at least similar to the one managed by AWS RDS

imho:
-  on Google Cloud you can test with  "Cloud SQL for Postgresql" ( https://cloud.google.com/sql/docs/postgres )
-  on Google Compute Engine ( VM ):  you have to tune the disks ; linux ; file system ; scheduler ; 
         and it is a complex task

imho:  select the perfect disk types for the postgresql data  ( and create a fast RAID )
https://cloud.google.com/compute/docs/disks
Compute Engine offers several types of storage options for your instances. Each of the following storage options has unique price and performance characteristics:
- Zonal persistent disk: Efficient, reliable block storage.
- Regional persistent disk: Regional block storage replicated in two zones.
- Local SSD: High performance, transient, local block storage.
- Cloud Storage buckets: Affordable object storage.
- Filestore: High performance file storage for Google Cloud users.

regards,
 Imre


Maurici Meneghetti <maurici.meneghetti@xxxxxxxxxxxxxxxxxxxx> ezt írta (időpont: 2021. febr. 23., K, 23:14):
Hi everyone,

I have 2 postgres instances created from the same dump (backup), one on a GCP VM and the other on AWS RDS. The first instance takes 18 minutes and the second one takes less than 20s to run this simples query:
SELECT "Id", "DateTime", "SignalRegisterId", "Raw" FROM "SignalRecordsBlobs" WHERE "SignalSettingId" = 103 AND "DateTime" BETWEEN '2019-11-28T14:00:12.540200000' AND '2020-07-23T21:12:32.249000000';
I’ve run this query a few times to make sure both should be reading data from cache.
I expect my postgres on GPC to be at least similar to the one managed by AWS RDS so that I can work on improvements parallelly and compare.

DETAILS:
Query explain for Postgres on GCP VM:
Bitmap Heap Scan on SignalRecordsBlobs SignalRecordsBlobs  (cost=18.80..2480.65 rows=799 width=70) (actual time=216.766..776.032 rows=5122 loops=1)
    Filter: (("DateTime" >= \'2019-11-28 14:00:12.5402\'::timestamp without time zone) AND ("DateTime" <= \'2020-07-23 21:12:32.249\'::timestamp without time zone))
    Heap Blocks: exact=5223
    Buffers: shared hit=423 read=4821
  ->  Bitmap Index Scan on IDX_SignalRecordsBlobs_SignalSettingId  (cost=0.00..18.61 rows=824 width=0) (actual time=109.000..109.001 rows=5228 loops=1)
          Index Cond: ("SignalSettingId" = 103)
          Buffers: shared hit=3 read=18
Planning time: 456.315 ms
Execution time: 776.976 ms

Query explain for Postgres on AWS RDS:
Bitmap Heap Scan on SignalRecordsBlobs SignalRecordsBlobs  (cost=190.02..13204.28 rows=6213 width=69) (actual time=2.215..14.505 rows=5122 loops=1)
    Filter: (("DateTime" >= \'2019-11-28 14:00:12.5402\'::timestamp without time zone) AND ("DateTime" <= \'2020-07-23 21:12:32.249\'::timestamp without time zone))
    Heap Blocks: exact=5209
    Buffers: shared hit=3290 read=1948
  ->  Bitmap Index Scan on IDX_SignalRecordsBlobs_SignalSettingId  (cost=0.00..188.46 rows=6405 width=0) (actual time=1.159..1.159 rows=5228 loops=1)
          Index Cond: ("SignalSettingId" = 103)
          Buffers: shared hit=3 read=26
Planning time: 0.407 ms
Execution time: 14.87 ms

PostgreSQL version number running:
• VM on GCP
: PostgreSQL 11.10 (Debian 11.10-0+deb10u1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
• Managed by RDS on AWS: PostgreSQL 11.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit

How PostgreSQL was installed:
• VM on GCP
: Already installed when created VM running Debian on Google Console.
• Managed by RDS on AWS: RDS managed the installation.

Changes made to the settings in the postgresql.conf file:
Here are some postgres parameters that might be useful:
Instance on VM on GCP (2 vCPUs, 2 GB memory, 800 GB disk):
• effective_cache_size: 1496MB
• maintenance_work_mem: 255462kB (close to 249MB)
• max_wal_size: 1GB
• min_wal_size: 512MB
• shared_buffers: 510920kB (close to 499MB)
• max_locks_per_transaction 1000
• wal_buffers: 15320kB (close to 15MB)
• work_mem: 2554kB
• effective_io_concurrency: 200
• dynamic_shared_memory_type: posix
On this instance we installed a postgres extension called timescaledb to gain performance on other tables. Some of these parameters were set using recommendations from that extension.

Instance managed by RDS (2 vCPUs, 2 GiB RAM, 250GB disk, 750 de IOPS):
• effective_cache_size: 1887792kB (close to 1844MB)
• maintenance_work_mem: 64MB
• max_wal_size: 2GB
• min_wal_size: 192MB
• shared_buffers: 943896kB (close to 922MB)
• max_locks_per_transaction 64

Operating system and version by runing "uname -a":
• VM on GCP:
Linux {{{my instance name}}} 4.19.0-14-cloud-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux
• Managed by AWS RDS: Aparently Red Hay as shown using SELECT version();

Program used to connect to PostgreSQL: Python psycopg2.connect() to create the connection and pandas read_sql_query() to query using that connection.

Thanks in advance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux