Search Postgresql Archives

Re: Question about memory usage

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 10, 2014 at 11:09 AM, Steve Atkins <steve@xxxxxxxxxxx> wrote:

On Jan 10, 2014, at 8:35 AM, Preston Hagar <prestonh@xxxxxxxxx> wrote:

> tl;dr: Moved from 8.3 to 9.3 and are now getting out of memory errors despite the server now having 32 GB instead of 4 GB of RAM and the workload and number of clients remaining the same.
>
>
> Details:
>
> We have been using Postgresql for some time internally with much success.  Recently, we completed a migration off of an older server running 8.3 to a new server running 9.3.  The older server had 4GB of RAM, the new server has 32 GB.
>
> For some reason, since migrating we are getting lots of "out of memory" and "cannot allocate memory" errors on the new server when the server gets under a decent load.  We have upped shmmax to 17179869184 and shmall to 4194304.

What are the exact error messages you’re getting, and where are you seeing them?


Thanks for the reply.  I'm seeing them in the main postgresql log (/var/log/postgresql/postgresql-9.3-main.log)

Here are a couple of examples from the incident we had this morning:

2014-01-10 06:14:40 CST  30176    LOG:  could not fork new process for connection: Cannot allocate memory
2014-01-10 06:14:40 CST  30176    LOG:  could not fork new process for connection: Cannot allocate memory


TopMemoryContext: 160408 total in 19 blocks; 12984 free (41 chunks); 147424 used
  TopTransactionContext: 8192 total in 1 blocks; 7392 free (0 chunks); 800 used
  Btree proof lookup cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  TableSpace cache: 8192 total in 1 blocks; 3216 free (0 chunks); 4976 used
  Type information cache: 24240 total in 2 blocks; 3744 free (0 chunks); 20496 used
  Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
  MessageContext: 524288 total in 7 blocks; 225240 free (3 chunks); 299048 used
  Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  smgr relation table: 57344 total in 3 blocks; 34320 free (10 chunks); 23024 used
  TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used
  Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
    PortalHeapMemory: 1024 total in 1 blocks; 768 free (0 chunks); 256 used
      ExecutorState: 329080 total in 8 blocks; 105944 free (4 chunks); 223136 used
        TupleSort: 32816 total in 2 blocks; 176 free (2 chunks); 32640 used
        printtup: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 8192 total in 1 blocks; 8128 free (3 chunks); 64 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 8192 total in 1 blocks; 7896 free (0 chunks); 296 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  Relcache by OID: 24576 total in 2 blocks; 11792 free (3 chunks); 12784 used
  CacheMemoryContext: 9815680 total in 25 blocks; 8143416 free (1 chunks); 1672264 used
    pg_toast_3598032_index: 1024 total in 1 blocks; 16 free (0 chunks); 1008 used
    mls_office_uid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    staff_primary: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    staff_desk_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    person_primary: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    person_person_workphone_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    person_person_mobilephone_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    person_person_lastname_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    person_person_homephone_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    person_person_homeofficephone_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    person_person_firstname_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    agent_primary: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    agent_membertype_id: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    agent_floydtraining_id: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    agent_agent_sync_realtorid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    agent_agent_sync_oagentid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    agent_agent_personid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    agent_agent_license_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_toast_2619_index: 1024 total in 1 blocks; 16 free (0 chunks); 1008 used
    mls_agent_uid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_uidprp_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_uid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_streetnum_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_streetname: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_sqfttotal_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_ownername_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_officelist_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_modified: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_mlsnumber_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_liststatus_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_listprice_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_city_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_beds_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_res_bathsfull_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_subdivision: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_zipcode: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_uidprp: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_uid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_priority: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_officelist: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_modified: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_mlsnum: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_longitude_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_liststatus: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_listprice: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_latitude_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_ishud_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_city_upcase_idx: 3072 total in 2 blocks; 1848 free (0 chunks); 1224 used
    mls_prp_city: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_agentlist2: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_prp_agentlist: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_media_uid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_media_tableuid: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_media_mediasource_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    mls_media_custompic_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_primary: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listing_sync_ilresid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listing_streetnumber_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listing_streetname_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listing_mlsnumber_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listing_longitude_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listing_listingid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listing_latitude_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listing_brokerageid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listing_agentid_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    listing_listdate_idx: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    code_primary: 1024 total in 1 blocks; 16 free (0 chunks); 1008 used
    class_primary: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_index_indrelid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_attrdef_adrelid_adnum_index: 1024 total in 1 blocks; 16 free (0 chunks); 1008 used
    pg_db_role_setting_databaseid_rol_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_opclass_am_name_nsp_index: 3072 total in 2 blocks; 1736 free (2 chunks); 1336 used
    pg_foreign_data_wrapper_name_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_enum_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_class_relname_nsp_index: 1024 total in 1 blocks; 16 free (0 chunks); 1008 used
    pg_foreign_server_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_statistic_relid_att_inh_index: 3072 total in 2 blocks; 1736 free (2 chunks); 1336 used
    pg_cast_source_target_index: 1024 total in 1 blocks; 16 free (0 chunks); 1008 used
    pg_language_name_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_collation_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_amop_fam_strat_index: 3072 total in 2 blocks; 1736 free (2 chunks); 1336 used
    pg_index_indexrelid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_ts_template_tmplname_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_ts_config_map_index: 3072 total in 2 blocks; 1784 free (2 chunks); 1288 used
    pg_opclass_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_foreign_data_wrapper_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_event_trigger_evtname_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_ts_dict_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_event_trigger_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_conversion_default_index: 3072 total in 2 blocks; 1784 free (2 chunks); 1288 used
    pg_operator_oprname_l_r_n_index: 3072 total in 2 blocks; 1736 free (2 chunks); 1336 used
    pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 16 free (0 chunks); 1008 used
    pg_enum_typid_label_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_ts_config_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_user_mapping_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_opfamily_am_name_nsp_index: 3072 total in 2 blocks; 1784 free (2 chunks); 1288 used
    pg_foreign_table_relid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_type_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_aggregate_fnoid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_constraint_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_ts_parser_prsname_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_ts_config_cfgname_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_ts_parser_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_operator_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_namespace_nspname_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_ts_template_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_amop_opr_fam_index: 3072 total in 2 blocks; 1736 free (2 chunks); 1336 used
    pg_default_acl_role_nsp_obj_index: 3072 total in 2 blocks; 1784 free (2 chunks); 1288 used
    pg_collation_name_enc_nsp_index: 3072 total in 2 blocks; 1784 free (2 chunks); 1288 used
    pg_range_rngtypid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_ts_dict_dictname_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_type_typname_nsp_index: 1024 total in 1 blocks; 16 free (0 chunks); 1008 used
    pg_opfamily_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_class_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_proc_proname_args_nsp_index: 3072 total in 2 blocks; 1736 free (2 chunks); 1336 used
    pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 16 free (0 chunks); 1008 used
    pg_proc_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_language_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_namespace_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1736 free (2 chunks); 1336 used
    pg_foreign_server_name_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_attribute_relid_attnam_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_conversion_oid_index: 1024 total in 1 blocks; 200 free (0 chunks); 824 used
    pg_user_mapping_user_server_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_conversion_name_nsp_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_authid_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_auth_members_member_role_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_tablespace_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_database_datname_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_auth_members_role_member_index: 1024 total in 1 blocks; 64 free (0 chunks); 960 used
    pg_database_oid_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
    pg_authid_rolname_index: 1024 total in 1 blocks; 152 free (0 chunks); 872 used
  MdSmgr: 8192 total in 1 blocks; 5440 free (0 chunks); 2752 used
  ident parser context: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  hba parser context: 15360 total in 4 blocks; 6544 free (2 chunks); 8816 used
  LOCALLOCK hash: 24576 total in 2 blocks; 13920 free (4 chunks); 10656 used
  Timezones: 83472 total in 2 blocks; 3744 free (0 chunks); 79728 used
  ErrorContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
2014-01-10 06:18:46 CST 10.1.1.6 16669 [unknown] production  10.1.1.6(36680)ERROR:  out of memory
2014-01-10 06:18:46 CST 10.1.1.6 16669 [unknown] production  10.1.1.6(36680)DETAIL:  Failed on request of size 500.



After some of the connections cleared up, (we load a lot of data around that time), things got "back to normal".  Our Nagios monitoring showed a 30 minute window or so when no new connections could be made and the log is filled with more or less the same lines.


 
>
> We had originally copied our shared_buffers, work_mem, wal_buffers and other similar settings from our old config, but after getting the memory errors have tweaked them to the following:
>
> shared_buffers            = 7680MB
> temp_buffers              = 12MB
> max_prepared_transactions = 0
> work_mem                  = 80MB
> maintenance_work_mem      = 1GB
> wal_buffers = 8MB
> max_connections = 350
>
> The current settings seem to have helped, but we are still occasionally getting the errors.
>
> The weird thing is that our old server had 1/8th the RAM, was set to max_connections = 600 and had the same clients connecting in the same way to the same databases and we never saw any errors like this in the several years we have been using it.
>
> One issue I could see is that one of our main applications that connects to the database, opens a connection on startup, holds it open the entire time it is running, and doesn't close it until the app is closed.  In daily usage, for much of our staff it is opened first thing in the morning and left open all day (meaning the connection is held open for 8+ hours).  This was never an issue with 8.3, but I know it isn't a "best practice" in general.

That’s probably not related to the problems you’re seeing - I have apps that hold a connection to the database open for years. As long as it doesn’t keep a transaction open for a long time, you’re fine.


Good to know, thanks.

 
>
> We are working to update our application to be able to use pgbouncer with transaction connections to try to alleviate the long held connections, but it will take some time.

Using pgbouncer is probably a good idea - to reduce the number of concurrent connections, rather than the length of connections, though.


I'm testing a updated version of our application against pgbouncer now so hopefully that will help some.

 
>
> In the meantime, is there some other major difference or setting in 9.3 that we should look out for that could be causing this?  Like I said, the same database with the same load and number of clients has been running on a 8.3 install for years (pretty much since 2008 when 8.3 was released) with lesser hardware with no issues.
>
> Let me know if any other information would help out or if anyone has suggestions of things to check.

Cheers,
  Steve



Thanks for your help.

Preston

 

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux