Hi David,

On 7.4.2013 03:51, David Boreham wrote:
> First I need to say that I'm asking this question on behalf of "a
> friend", who asked me what I thought on the subject -- I host all the
> databases important to me and my livelihood, on physical machines I
> own outright. That said, I'm curious as to the current thinking on a)
> whether it is wise, and b) if so how to deploy, PG servers on AWS. As
> I recall, a couple years ago it just wasn't a wise plan because
> Amazon's I/O performance and reliability wasn't acceptable. Perhaps
> that's no longer the case..

That depends on what you mean by reliability and (poor) performance.
Amazon says the AFR for EBS is 0.1-0.5% (under some conditions, see
http://aws.amazon.com/ebs/), and I have no reason not to trust them in
this case. Maybe it was much worse a few years ago, but I wasn't
working with AWS back then, so I can't compare.

As for performance, AFAIK regular EBS volumes always had (and probably
always will have) a ~32 MB/s throughput limit. Thanks to caching built
into EBS, the performance may seem much better initially (say, twice
as good), but after a sustained write workload (say, 15-30 minutes)
you're back at 32 MB/s per volume.

The main problem with regular EBS is variability - the numbers above
are for the case when everything operates fine. When something goes
wrong you may get 1 MB/s for a period of time, and when you create 10
volumes, each will perform a bit differently.

There are ways to handle this, though - the "old way" is to build a
RAID10 array on top of regular EBS volumes, the "new way" is to use
EBS with Provisioned IOPS, possibly with RAID0 (see the sketch later
in this message).

> Just to set the scene -- the application is a very high traffic web
> service where any down time is very costly, processing a few hundred
> transactions/s.

What does "high traffic" mean for the database? Does it translate to a
lot of reads, a lot of writes, or something else?

> Scanning through the latest list of AWS instance types, I can see two
> plausible approaches:
>
> 1. High I/O Instances: (regular AWS instance but with SSD local
> storage) + some form of replication. Replication would be needed
> because (as I understand it) any AWS instance can be "vanished" at
> any time due to Amazon screwing something up, maintenance on the
> host, etc (I believe the term of art is "ephemeral").

Yes. You'll get great I/O performance with these SSD-based instances
(easily ~1 GB/s), so you'll probably hit CPU bottlenecks instead.
You're right that to handle instance/ephemeral failures you'll have to
use some sort of replication - either your own application-specific
solution, or one of the existing ones (async/sync streaming
replication, log shipping, Slony, Londiste, whatever suits your needs
...). If you really value availability, you should deploy the replica
in a different availability zone or data center.

> 2. EBS-Optimized Instances: these allow the use of EBS storage
> (SAN-type service) from regular AWS instances. Assuming that EBS is
> maintained to a high level of availability and performance (it
> doesn't, afaik, feature the vanishing property of AWS machines), this
> should in theory work out much the same as a traditional cluster of
> physical machines using a shared SAN, with the appropriate voodoo to
> fail over between nodes.

No, that's not what "EBS Optimized" instances are for. All AWS
instance types can use EBS, over a SHARED network link. That means
e.g. HTTP or SSH traffic influences EBS performance, because it all
uses the same ethernet link. "EBS Optimized" means the instance has a
network link dedicated to EBS traffic, with guaranteed throughput.
That does not fix the variability of EBS performance, though ...

What you're looking for is called "Provisioned IOPS" (PIOPS), which
guarantees the performance of the EBS volume in terms of IOPS with a
16 kB block size. For example, you may create an EBS volume with 2000
IOPS, which is ~32 MB/s with 16 kB blocks. That's not much by itself,
but it's easy to build a RAID0 array on top of several such volumes.
We're using this for some of our databases and are very happy with it.
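Just to illustrate the PIOPS + RAID0 approach, here is a minimal
sketch using the AWS command line tools and mdadm (the sizes, IOPS,
IDs, device names and availability zone are made-up examples, not a
recommendation):

    # create four volumes with provisioned IOPS (repeat for each one)
    aws ec2 create-volume --size 100 --volume-type io1 --iops 2000 \
        --availability-zone us-east-1a
    aws ec2 attach-volume --volume-id vol-aaaaaaaa \
        --instance-id i-bbbbbbbb --device /dev/sdf

    # on the instance, stripe the attached volumes into a RAID0 array
    mdadm --create /dev/md0 --level=0 --raid-devices=4 \
        /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
    mkfs.xfs /dev/md0
    mount -o noatime /dev/md0 /var/lib/pgsql

Four such volumes give you roughly 4x the per-volume throughput, at
the cost of losing the whole array when a single volume fails - which
is one more reason to have the replica discussed above.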
The "EBS Optimized" says that the instance has a network link dedicated for EBS traffic, with guaranteed throughput. That is not going to fix the variability or EBS performance, though ... What you're looking for is called "Provisioned IOPS" (PIOPS) which guarantees the EBS volume performance, in terms of IOPS with 16kB block. For example you may create an EBS volume with 2000 IOPS, which is ~32MB/s (with 16kB blocks). It's not much, but it's much easier to build RAID0 array on top of those volumes. We're using this for some of our databases and are very happy with it. Obviously, you want to use PIOPS with EBS Optimized instances. I don't see much point in using only one of them. But still, depends on the required I/O performance - you can't really get above 125MB/s (m2.4xlarge) or 250MB/s (cc2.8xlarge). And you can't really rely on this if you need quick failover to a different availability zone or data center, because it's quite likely the EBS is going to be hit by the issue (read the analysis of AWS outage from April 2011: http://aws.amazon.com/message/65648/). > Any thoughts, wisdom, and especially from-the-trenches experience, would > be appreciated. My recommendation is to plan for zone/datacenter failures first. That means build a failover replica in a different zone/datacenter. You might be able to handle isolated EBS failures e.g. using snapshots and/or backups and similar recovery procedures, but it may require unpredictable downtimes (e.g. while we don't see failed EBS volumes very frequently, we do see EBS volumes stuck in "attaching" much more frequently). To handle I/O performance, you may either use EBS with PIOPS (which will also give you more reliability) or SSD instances (but you'll have to either setup a local replica to handle the instance failures or do the failover to the other cluster). > In the Googlesphere I found this interesting presentation : > http://www.pgcon.org/2012/schedule/attachments/256_pg-aws.pdf which > appears to support option #2 with s/w (obviously) RAID on the PG hosts, > but with replication rather than SAN cluster-style failover, or perhaps > in addition to. Christophe's talk is definitely a valuable source, although maybe a bit difficult to follow without his comments (just like any other talk). And I don't see any mention of "Provisioned IOPS" in it, probably as it was prepared before Amazon started to offer that feature. Tomas -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general