With that clarification, I stand squarely behind what others are
saying ... if performance is important to you, then you should always
run databases on dedicated hardware, with the OS running on bare metal
with no virtualization. VirtualBox has even more I/O losses than
Hyper-V. It's simply not designed for this, and you're giving away a
ton of performance.
If nothing else, my confusion should indicate to you how
unconventional and poorly performing this virtualized setup is ... I
simply assumed that the only plausible reason you were piggybacking on
virtualization on Windows was a mandated lack of alternative options.
Reload the hardware with an OS which PGSQL supports well, and get rid
of the VirtualBox and Windows layers. If you have hardware that only
Windows supports well, then you may need to make some hardware changes.
I haven't said anything about which Unix-like OS .... you may find
people arguing passionately for BSD vs. Linux .... however, the
difference between these is negligible compared to "virtualized vs.
real system", and at this point considerations like support base, ease
of use and familiarity also come into play.
IMHO Ubuntu would be a fine choice, and PGSQL is a "first-class"
supported package from the distributor ... however, at customer sites,
I've typically used Red Hat AS because they have a corporate
preference for it, even though it is less convenient to install and
manage.
On Mon, Jun 14, 2010 at 7:41 PM, Tom Wilcox <hungrytom@xxxxxxxxx> wrote:
Hi Dave,
I am definitely able to switch OS if it will get the most out of
Postgres. So it is definitely a case of choosing the OS based on the
needs of the app, provided the choice is well justified.
Currently, we are running Ubuntu Server 64-bit in a VirtualBox VM.
Cheers,
Tom
Dave Crooke wrote:
Tom
I always prefer to choose apps based on business needs, then
the OS based on the needs for the app.
Cynically, I often feel that the best answer to "we have a
policy that says we're only allowed to use operating system x"
is to ignore the policy .... the kind of people ignorant
enough to be that blinkered are usually not tech-savvy enough
to notice when it gets flouted :-)
More seriously, is the policy "Windows only on the metal" or
could you run e.g. VMware ESX Server? I/O is the area that
takes the biggest hit in virtualization, and ESX Server has
far less overhead loss than either Hyper-V (which I presume
you are using) or VMware Workstation on NT kernels.
If it's a Windows-only policy, then perhaps you can run those
traps in reverse, and switch to a Windows database, i.e.
Microsoft SQL Server.
Cheers
Dave
On Mon, Jun 14, 2010 at 1:53 PM, Tom Wilcox <hungrytom@xxxxxxxxx> wrote:
Hi Bob,
Thanks a lot. Here's my best attempt to answer your questions:
The VM is set up with a virtual disk image dynamically expanding to
fill an allocation of 300GB on a fast, local hard drive
(avg read speed = 778MB/s).
WAL files can have their own disk, but how significantly would
this affect our performance?
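(For what it's worth, WAL on its own disk mainly helps write-heavy workloads, since WAL writes are sequential and fsync-bound; on a mostly-read OLAP load the gain is usually smaller. The usual approach is a symlink. A minimal sketch, with the server stopped first - the /tmp paths below are hypothetical stand-ins for the real data directory and WAL mount point:)

```shell
# Sketch: relocate the WAL directory (pg_xlog in 8.x/9.x) onto a
# dedicated disk via a symlink. These paths are illustrative only.
DATA=/tmp/demo_pgdata
WALDISK=/tmp/demo_waldisk
mkdir -p "$DATA/pg_xlog" "$WALDISK"

# In a real setup: stop the server before moving anything.
mv "$DATA/pg_xlog" "$WALDISK/pg_xlog"
ln -s "$WALDISK/pg_xlog" "$DATA/pg_xlog"
ls -ld "$DATA/pg_xlog"
```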
The filesystem of the host OS is NTFS (Windows Server 2008, 64-bit);
the guest filesystem is ext2 (Ubuntu 64-bit).
The workload is OLAP (lots of large, complex queries on large
tables run in sequence).
In addition, I have reconfigured my server to use more memory.
Here's a detailed blow-by-blow of how I reconfigured my system to
get better performance (for anyone who might be interested)...
In order to increase the shared memory on Ubuntu I edited the
System V IPC values using sysctl:
sysctl -w kernel.shmmax=16106127360
sysctl -w kernel.shmall=2097152
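(One detail worth double-checking here: kernel.shmmax is in bytes but kernel.shmall is in pages, so shmall=2097152 only covers 8GB with 4kB pages - less than the ~15GB shmmax above. A quick sketch of the arithmetic, assuming the usual 4096-byte page size:)

```shell
SHMMAX=16106127360              # bytes (the value set above, ~15GB)
PAGE_SIZE=4096                  # typical x86-64 page size; verify with: getconf PAGE_SIZE
SHMALL=$((SHMMAX / PAGE_SIZE))  # pages needed to cover shmmax
echo "kernel.shmall should be at least $SHMALL"
```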
I had some fun with permissions as I somehow managed to change the
owner of postgresql.conf to root where it needed to be postgres,
resulting in failure to start the service. (Fixed with
chown postgres:postgres ./data/postgresql.conf and
chmod u=rwx ./data -R.)
I changed the following params in my configuration file..
default_statistics_target=10000
maintenance_work_mem=512MB
work_mem=512MB
shared_buffers=512MB
wal_buffers=128MB
With this config, the following command took 6,400,000 ms:
EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org;
With plan:
"Seq Scan on match_data (cost=0.00..1392900.78 rows=32237278 width=232) (actual time=0.379..464270.682 rows=27777961 loops=1)"
"Total runtime: 6398238.890 ms"
With these changes to the previous config, the same command took
5,610,000 ms:
maintenance_work_mem=4GB
work_mem=4GB
shared_buffers=4GB
effective_cache_size=4GB
wal_buffers=1GB
Resulting plan:
"Seq Scan on match_data (cost=0.00..2340147.72 rows=30888572 width=232) (actual time=0.094..452793.430 rows=27777961 loops=1)"
"Total runtime: 5614140.786 ms"
Then I performed these changes to the postgresql.conf file:
max_connections=3
effective_cache_size=15GB
maintenance_work_mem=5GB
shared_buffers=7000MB
work_mem=5GB
And ran this query (for a quick look - can't afford the time for
the previous tests..):
EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE
match_data_id < 100000;
Result:
"Index Scan using match_data_pkey1 on match_data (cost=0.00..15662.17 rows=4490 width=232) (actual time=27.055..1908.027 rows=99999 loops=1)"
" Index Cond: (match_data_id < 100000)"
"Total runtime: 25909.372 ms"
I then ran EnterpriseDB's tuner on my postgres install (for a
dedicated machine) and got the following settings and results:
EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE
match_data_id < 100000;
"Index Scan using match_data_pkey1 on match_data (cost=0.00..13734.54 rows=4495 width=232) (actual time=0.348..2928.844 rows=99999 loops=1)"
" Index Cond: (match_data_id < 100000)"
"Total runtime: 1066580.293 ms"
For now, I will go with the config using 7000MB shared_buffers.
Any suggestions on how I can further optimise this config for a
single-session, 64-bit install utilising ALL of 96GB RAM? I will
spend the next week making the case for a native install of Linux,
but first we need to be 100% sure that is the only way to get the
most out of Postgres on this machine.
Thanks very much. I now feel I am at a position where I can really
explore and find the optimal configuration for my system, but
would still appreciate any suggestions.
Cheers,
Tom
On 11/06/2010 07:25, Bob Lunney wrote:
Tom,
First off, I wouldn't use a VM if I could help it, however,
sometimes you have to make compromises. With a 16 GB machine
running 64-bit Ubuntu and only PostgreSQL, I'd start by
allocating 4 GB to shared_buffers. That should leave more
than enough room for the OS and filesystem cache. Then I'd
begin testing by measuring response times of representative
queries with significant amounts of data.
Also, what is the disk setup for the box? Filesystem? Can
WAL files have their own disk? Is the workload OLTP or OLAP,
or a mixture of both? There is more that goes into tuning a
PG server for good performance than simply installing the
software, setting a couple of GUCs and running it.
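(That 4 GB on a 16 GB box matches the common starting heuristic of roughly 25% of RAM for shared_buffers, with effective_cache_size around 50-75% of RAM. A sketch of the arithmetic - these fractions are rules of thumb to start testing from, not guarantees:)

```shell
RAM_GB=16                               # total machine RAM
SHARED_BUFFERS_GB=$((RAM_GB / 4))       # ~25% heuristic
EFFECTIVE_CACHE_GB=$((RAM_GB * 3 / 4))  # ~75% heuristic
echo "shared_buffers ~ ${SHARED_BUFFERS_GB}GB, effective_cache_size ~ ${EFFECTIVE_CACHE_GB}GB"
```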
Bob
--- On Thu, 6/10/10, Tom Wilcox <hungrytom@xxxxxxxxx> wrote:
From: Tom Wilcox <hungrytom@xxxxxxxxx>
Subject: Re: requested shared memory size overflows size_t
To: "Bob Lunney" <bob_lunney@xxxxxxxxx>
Cc: "Robert Haas" <robertmhaas@xxxxxxxxx>, pgsql-performance@xxxxxxxxxxxxxx
Date: Thursday, June 10, 2010, 10:45 AM
Thanks guys. I am currently installing Pg64 onto an Ubuntu Server
64-bit installation running as a VM in VirtualBox with 16GB of RAM
accessible.
If what you say is true, then what do you suggest I do to
configure my new setup to best use the available 16GB (96GB and
native install eventually, if the test goes well) of RAM on Linux?
I was considering starting by using EnterpriseDB's tuner to see if
that optimises things to a better quality..
Tom
On 10/06/2010 15:41, Bob Lunney wrote:
True, plus there are the other issues of increased
checkpoint times and I/O, bgwriter tuning, etc. It may
be better to let the OS cache the files and size
shared_buffers to a smaller value.
Bob Lunney
--- On Wed, 6/9/10, Robert Haas <robertmhaas@xxxxxxxxx> wrote:
From: Robert Haas <robertmhaas@xxxxxxxxx>
Subject: Re: requested shared memory size overflows size_t
To: "Bob Lunney" <bob_lunney@xxxxxxxxx>
Cc: pgsql-performance@xxxxxxxxxxxxxx, "Tom Wilcox" <hungrytom@xxxxxxxxxxxxxx>
Date: Wednesday, June 9, 2010, 9:49 PM
On Wed, Jun 2, 2010 at 9:26 PM, Bob Lunney <bob_lunney@xxxxxxxxx> wrote:
Your other option, of course, is a nice 64-bit linux variant,
which won't have this problem at all.

Although, even there, I think I've heard that after 10GB you don't
get much benefit from raising it further. Not sure if that's
accurate or not...
-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
-- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance