On 12/8/18 11:00, Square Bob wrote: > My apologies if this is off topic. The AWS Aurora PostgreSQL forums are also a great place to post questions like this https://forums.aws.amazon.com/forum.jspa?forumID=227 > Our company is moving to Aurora, In the past I would take care not to > allow postgresql to over-commit memory beyond the actual memory on the > server, which meant I would add the buffer pool + (work_mem * > max_connections) + (maintenance_work_mem * autovacuum threads) > > However as I look at the aroura defaults they are all off the charts, > for example, based on the calculations in the config (amazon doesn't > make it easy, some settings are in pages, some are in kb, some are who > knows what) I see the following settings as default in our aroura config: > > The instance size is db.r4.xlarge > this instance size is listed as having 30.5GB of ram > > Here's the default settings: > > shared_buffers: {DBInstanceClassMemory/10922} > which equates to 24GB On RDS PostgreSQL, the default is 25% of your server memory. This seems to be pretty widely accepted as a good starting point on PostgreSQL. But remember that in open source PostgreSQL on linux, all I/O goes through the filesystem and kernel buffer cache so in reality any available memory on the box is used for cache. Unlike normal PostgreSQL, Aurora does not do I/O through the linux buffer cache. If the default was left at 25% then this would result in very surprising performance for most people. On other databases where direct I/O is the normal pattern, 75% of memory on the box is often cited as a good starting point for OLTP systems. This default used on Aurora. > work_mem: 64000 (kb) > which equates to 65.5MB At present, this has been left at the community default for both RDS and Aurora PostgreSQL. > maintenance_work_mem: GREATEST({DBInstanceClassMemory/63963136*1024},65536) > which equates to 4.2GB This formula will set maint_work_mem to 1.639% of the memory on the system. It should be 511MB on an instance with 30.5GB of memory. > max_connections: LEAST({DBInstanceClassMemory/9531392},5000) > which equates to 3,380 On both RDS PostgreSQL and Aurora, max_connections is set to a value that's conservatively high. While the default setting here won't stop you, an r4.xlarge has only two physical CPUs and it's probably not a good idea to run with 3000 connections. Connection management is a common challenge with databases of all flavors. The right number is incredibly workload dependent and I'm not sure whether it's possible to have a truly meaningful default limit as a formula of the server type. > According to my math (If I got it right) in a worst case scenario, > if we maxed out max_connections, work_mem and maintenance_work_mem limits > the db would request 247GB of memory It's not quite this straightforward. First of all, work_mem is per plan node and it's only a guidance for where things should spill to disk. It doesn't completely prevent runaway memory usage by queries. Many queries don't need much work_mem at all, and many other queries use more memory than work_mem. Secondly, IIRC, autovacuum actually has a hard-coded artificial 1GB limit regardless of your maint_work_mem. However operations like index creation can in fact use all of maint_work_mem. > Additionally amazon has set effective_cache_size = > {DBInstanceClassMemory/10922} > > which equates to about 2.9MB (which given the other outlandish setting > may be the only appropriate setting in the system) That's actually the same as shared_buffers - 75% of the memory on the server. And remember this is a planner/costing parameter; it has nothing to do with allocating actual memory. > What the hell is amazon doing here? Am I missing the boat on tuning > postgresql memory? Is amazon simply counting on the bet that users will > never fully utilize an instance? Memory management is hard. Nevermind PostgreSQL - it's hard to even get a clear picture of what happens in the Linux kernel with regard to memory. Think about these two questions: (1) Is memory pressure slowing me down? (2) Is memory pressure causing any risk or danger to the system? I've heard of issues even with the new MemAvailable value that was added to /proc/meminfo - it seems difficult to get an accurate picture. While over-subscription might sound bad, you probably don't want to just disable swap completely either. There are usually pages sitting in memory that are completely unnecessary. I'm not going to claim the RDS defaults are perfect - in fact I'd love to hear ideas about how they could be improved. [Hopefully without starting any religious wars...] But I hope I've shown here that they aren't as completely crazy as they first appeared? :) -Jeremy -- Jeremy Schneider Database Engineer Amazon Web Services